Abstract. We prove the so-called inverse conjecture for the Gowers $U^{s+1}$-norm in
the case $s = 3$ (the cases $s < 3$ being established in previous literature). That is, we
show that if $f : [N] \to \mathbb{C}$ is a function with $|f(n)| \le 1$ for all $n$ and $\|f\|_{U^4} \ge \delta$ then
there is a bounded complexity 3-step nilsequence $F(g(n)\Gamma)$ which correlates with $f$.
The approach seems to generalise so as to prove the inverse conjecture for $s \ge 4$ as
well, and a longer paper will follow concerning this.

By combining the main result of the present paper with several previous results 
of the first two authors one obtains the generalised Hardy-Littlewood prime-tuples 
conjecture for any linear system of complexity at most 3. In particular, we have an 
asymptotic for the number of 5-term arithmetic progressions $p_1 < p_2 < p_3 < p_4 < p_5 \le N$
of primes.
Notation. By a 1-bounded function on a set $X$ we mean a function $f : X \to \mathbb{C}$
with $|f(x)| \le 1$ for all $x \in X$. If the cardinality $|X|$ of $X$ is finite and non-zero, we
write $\mathbb{E}_{x \in X} f(x)$ for $\frac{1}{|X|} \sum_{x \in X} f(x)$. Throughout the paper the letter $M$ will refer
to a large positive "complexity" quantity, normally introduced in each statement of a
lemma, proposition or theorem. The letters $c$ and $C$ are reserved for absolute constants
with $0 < c < 1 < C$; different instances of the notation will generally denote different
absolute constants. If $x \in \mathbb{R}$ we will write $\lfloor x \rfloor$ for the greatest integer less than or equal
to $x$, and $\{x\} := x - \lfloor x \rfloor$. If $N$ is a positive integer then we write $[N] := \{1, \dots, N\}$.

1. Introduction 

This paper concerns a special case of a family of conjectures named the Inverse
Conjectures for the Gowers norms by the first two authors. For each integer $s \ge 1$
the inverse conjecture GI(s), whose statement we recall shortly, describes the structure
of 1-bounded functions $f : [N] \to \mathbb{C}$ whose $(s+1)$st Gowers norm is large.

These conjectures together with a good deal of motivation and background to them
are discussed in [10, 11, 13]. The conjectures GI(1) and GI(2) are already known, the
former being a straightforward application of Fourier analysis and the latter being the
main result of [11]. The aim of the present paper is to establish the first unknown case,
that of GI(3), using what is in essence a method which seems to generalise to prove
GI(s) in general.

We have taken advantage of some shortcuts and explicit calculations that are specific 
to the s = 3 case, hoping that this will render the paper somewhat appetising as an 
hors d'oeuvres for the general case. The general case will, furthermore, be phrased in 
the language of non-standard analysis since this provides a very effective framework 
in which to manage the complicated hierarchies of parameters that appear here. We 
offer the present paper to those readers who are not immediately comfortable with the 
nonstandard language; it also serves as an illustration of the point, to be made in the 
longer paper to follow, that our arguments may be taken out of the choice-dependent 
realm of nonstandard analysis and, in particular, can lead to effective bounds (albeit 
extremely weak ones). 

We begin by recalling the definition of the Gowers norms. If $G$ is a finite abelian
group and if $f : G \to \mathbb{C}$ is a function then we define

\[ \|f\|_{U^k(G)} := \big( \mathbb{E}_{x, h_1, \dots, h_k \in G} \Delta_{h_1} \cdots \Delta_{h_k} f(x) \big)^{1/2^k}, \]

where $\Delta_h f$ is the multiplicative derivative

\[ \Delta_h f(x) := f(x+h) \overline{f(x)}. \]

In this paper we will be concerned with functions on $[N]$, which is not quite a group.
To define the Gowers norms of a function $f : [N] \to \mathbb{C}$, set $G := \mathbb{Z}/\tilde N\mathbb{Z}$ for some integer
$\tilde N \ge 2^k N$, define a function $\tilde f : G \to \mathbb{C}$ by $\tilde f(x) = f(x)$ for $x = 1, \dots, N$ and $\tilde f(x) = 0$
otherwise, and set $\|f\|_{U^k[N]} := \|\tilde f\|_{U^k(G)} / \|1_{[N]}\|_{U^k(G)}$, where $1_{[N]}$ is the indicator function
of $[N]$. It is easy to see that this definition is independent of the choice of $\tilde N$, and so for
definiteness one could take $\tilde N := 2^k N$. Henceforth we shall write simply $\|f\|_{U^k}$, rather
than $\|f\|_{U^k[N]}$, since all Gowers norms will be on $[N]$. One can show that $\|\cdot\|_{U^k}$ is
indeed a norm for any $k \ge 2$, though we shall not need this here.
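
The definition just given can be tried out numerically on very small examples. The following is a minimal brute-force sketch, assuming NumPy; the function names are illustrative and not from the paper, and the running time grows like the $k$-th power of the group size, so it is only meant for toy inputs.

```python
import numpy as np

def gowers_power(f, k):
    """Return E_{x,h_1..h_k} Delta_{h_1}...Delta_{h_k} f(x) on Z/len(f)Z."""
    f = np.asarray(f, dtype=complex)
    if k == 0:
        return f.mean()
    n = len(f)
    # np.roll(f, -h)[x] == f[(x + h) % n], so the product below is Delta_h f.
    return sum(gowers_power(np.roll(f, -h) * np.conj(f), k - 1) for h in range(n)) / n

def gowers_norm_cyclic(f, k):
    """U^k norm on a cyclic group (the average is real and non-negative for k >= 1)."""
    return max(gowers_power(f, k).real, 0.0) ** (1.0 / 2 ** k)

def gowers_norm_interval(f, k):
    """U^k[N] norm of f = (f(1), ..., f(N)), embedding [N] into Z/(2^k N)Z."""
    f = np.asarray(f, dtype=complex)
    N = len(f)
    modulus = 2 ** k * N
    padded = np.zeros(modulus, dtype=complex)
    indicator = np.zeros(modulus, dtype=complex)
    padded[1:N + 1] = f
    indicator[1:N + 1] = 1
    return gowers_norm_cyclic(padded, k) / gowers_norm_cyclic(indicator, k)
```

For instance, a quadratic phase on $\mathbb{Z}/7\mathbb{Z}$ has $U^3$-norm 1 but small $U^2$-norm, and a linear phase on $[N]$ has $U^2[N]$-norm 1 after the normalisation by $\|1_{[N]}\|_{U^2(G)}$.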

The Inverse conjecture for the Gowers $U^{s+1}$-norm posits an answer to the following
question.

Question 1.1. Suppose that $f : [N] \to \mathbb{C}$ is a 1-bounded function and let $\delta > 0$ be a
positive real number. What can be said if $\|f\|_{U^{s+1}} \ge \delta$?

The conjecture made in [13] is that $f$ must correlate with a certain rather algebraic
object called an $s$-step nilsequence. In the light of subsequent work [14, 15] it seems
natural to work with a somewhat more general object called a degree $s$ polynomial
nilsequence. We recall now the bald definition; for much more motivation and examples,
see the introduction to 



Definition 1.2 (Polynomial nilsequence). Let $G$ be a connected, simply-connected
nilpotent Lie group. By a filtration $G_\bullet$ of degree $s$ we mean a nested sequence
$G = G_{(0)} = G_{(1)} \supseteq G_{(2)} \supseteq \dots \supseteq G_{(s+1)} = \{\mathrm{id}\}$ with the property that $[G_{(i)}, G_{(j)}] \subseteq G_{(i+j)}$.
By a polynomial sequence adapted to $G_\bullet$ we mean a map $g : \mathbb{Z} \to G$ such that
$\partial_{h_i} \cdots \partial_{h_1} g \in G_{(i)}$ for all $h_1, \dots, h_i \in \mathbb{Z}$, where $\partial_h g(n) := g(n+h) g(n)^{-1}$. Let $\Gamma \le G$
be a discrete and cocompact subgroup, so that the quotient $G/\Gamma$ is a nilmanifold, and
assume that each of the $G_{(i)}$ are rational subgroups. If $F : G/\Gamma \to \mathbb{C}$ is a 1-bounded,
Lipschitz function then the sequence $(F(g(n)\Gamma))_{n \in \mathbb{Z}}$ is called a polynomial nilsequence
of degree $s$.

Remark. An important example of a filtration of a nilpotent group is the lower central
series $G_0 \supseteq G_1 \supseteq G_2 \supseteq \cdots$, in which $G_0 = G_1 = G$, and $G_{i+1} := [G, G_i]$ for $i \ge 1$. It is
classical (see, for example, [3]) that this is a filtration of degree $s$ whenever $G$ is $s$-step
nilpotent. This is the minimal example of a filtration, since for any other filtration $G_\bullet$
one has $G_i \subseteq G_{(i)}$.

Remark. An important fact about polynomial sequences adapted to a filtration $G_\bullet$
is that they form a group under pointwise multiplication: see [20] or [14, Proposition
6.2]. A polynomial sequence $g : \mathbb{Z} \to G$ can also be uniquely expressed as a Taylor
expansion $g(n) = g_0 g_1^{\binom{n}{1}} \cdots g_s^{\binom{n}{s}}$ for some $g_i \in G_{(i)}$ for $i = 0, 1, \dots, s$, where $\binom{n}{i}$ is the
usual binomial coefficient; see [14, Section 6].

Remark. If $G$ admits a filtration of degree $s$ then, as we remarked above, $G$ must
be $s$-step nilpotent. On the other hand, the degree can exceed the step by an arbitrary
amount. For instance, if $P : \mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ is a polynomial of degree $d \ge 1$, then the function
$e(P(n)) := e^{2\pi i P(n)}$ is a polynomial nilsequence of degree $d$, despite being associated to
a nilmanifold $G/\Gamma = \mathbb{R}/\mathbb{Z}$ of step just 1.

Roughly speaking, the inverse conjecture GI(s) asserts that a 1-bounded function $f$
has large $U^{s+1}$-norm if and only if it correlates with a degree $s$ nilsequence. However,
every aspect of this statement must be quantified in order to make a precise statement.
The key issue here lies in defining the complexity of a nilsequence, a matter which was
addressed in some detail in [14, Sec 2]. In this paper (fortunately) we can take a much
rougher approach. If $\delta > 0$ is some parameter we shall simply say that the complexity
of a polynomial nilsequence $(F(g(n)\Gamma))_{n \in \mathbb{Z}}$ is $O_\delta(1)$ if the following list of objects are
bounded in a way that depends only on $\delta$:

• $\dim G$;

• The rationality of some Mal'cev basis $\mathcal{X}$ for $G/\Gamma$ (see [14, Definition 2.4]);

• The rationality of each subgroup $G_{(i)}$ in the filtration (see [14, Definition 2.5]);

• The Lipschitz norm of $F$, measured using the metric defined in [14, Definition
2.2].



One may define rationality topologically, by stipulating that the $G_{(i)}$ are connected Lie subgroups of
$G$ and that $\Gamma \cap G_{(i)}$ is a cocompact subgroup of $G_{(i)}$. Some readers may wish to think more concretely,
in terms of the existence of a Mal'cev basis as in [14, Definition 2.1].
We do not encourage the reader to read those definitions in detail at this stage. The
important thing to note is that nothing is said about the polynomial sequence $g$, other
than that it is adapted to the filtration $G_\bullet$.

We may now state the Inverse Conjecture for the Gowers $U^{s+1}$-norm, GI(s), properly.

Conjecture 1.3 (GI(s)). Suppose that $f : [N] \to \mathbb{C}$ is a 1-bounded function and that
$\|f\|_{U^{s+1}} \ge \delta$. Then there is a degree $s$ polynomial nilsequence $(F(g(n)\Gamma))_{n \in \mathbb{Z}}$ of
complexity $O_\delta(1)$ such that $|\mathbb{E}_{n \in [N]} f(n) F(g(n)\Gamma)| \gg_\delta 1$.

As hinted earlier, this is not quite the formulation of GI(s) originally given in [13,
Section 8]. There, it was posited that $f$ correlates with a linear nilsequence $(F(g^n\Gamma))_{n \in \mathbb{Z}}$.
One might now relabel this the strong inverse conjecture. In the longer paper to come
we will show how this in fact follows from Conjecture 1.3. In the special case of the
$U^4$-norm under consideration here, it is possible to verify the strong inverse conjecture
quite directly by inspection, and we sketch this in Appendix F. We would, however, like
to impress upon the reader our opinion that Conjecture 1.3 is the most natural one, a
viewpoint that became apparent to the first two authors in the light of our paper [14].
Unfortunately [13] was written before that paper and hence operates under the
assumption of the strong inverse conjecture. Relatively simple changes would be required to
make all of the arguments there work under the assumption of Conjecture 1.3, however,
the key issue being §11 of that paper.

The evidence for the inverse conjectures prior to the present work was a "local
version" due to Gowers [8], its truth in the cases $s = 1$ and 2 (see [11]) as well as the truth
of analogues of the conjecture in both ergodic theory [18, 28] and in the "finite field
model" in which $[N]$ is replaced by $\mathbb{F}^n$ for some small prime field $\mathbb{F}$ [TJ |2"B] .

It is also known that this conjecture is necessary, in the following sense. 

Proposition 1.4 (Necessity of inverse conjecture). Suppose that $f : [N] \to \mathbb{C}$ is a
1-bounded function, that $(F(g(n)\Gamma))_{n \in \mathbb{Z}}$ is a polynomial nilsequence of degree $s$ and
complexity $O_\delta(1)$, and that $|\mathbb{E}_{n \in [N]} f(n) F(g(n)\Gamma)| \ge \delta$. Then $\|f\|_{U^{s+1}} \gg_\delta 1$.

There is currently no proof of this written in the literature. In the case of linear
nilsequences $(F(g^n\Gamma))_{n \in \mathbb{Z}}$ there are two different (albeit related) proofs in the literature:
one in [11, Proposition 12.6] and the other in [13, Section 11]. The second of these
proofs would generalise rather easily to the more general setting of degree $s$ polynomial
nilsequences $(F(g(n)\Gamma))_{n \in \mathbb{Z}}$, the key issue being to note that [13, Lemma E.4] is true
for the values $(g(n + \omega \cdot h)\Gamma)_{\omega \in \{0,1\}^{s+1}}$, this being essentially [14, Proposition 6.5]. The
reader will doubtless be relieved to hear that we recently discovered a very short proof of
Proposition 1.4, and we give this in Appendix G. Note, however, that this proposition
is included for motivation and interest only, and is not actually required in this paper.

Here, then, is the main result of our paper.

Theorem 1.5 (GI(3)). The inverse conjecture for the $U^4$-norm, GI(3), is true.

As already remarked, in Appendix F we will also establish the strong form of the
inverse conjecture for the $U^4$-norm, in the form given in [13, Section 8].

By combining this result with the previous results in [13, 15] we obtain a proof of
what was referred to in [13] as the generalised Hardy-Littlewood conjecture for linear
systems of complexity at most 3. In particular we have the following.



We remark that a linear nilsequence is not the same thing as a degree 1 nilsequence; a typical linear 
nilsequence on an s-step nilmanifold will have degree s. 
Theorem 1.6. The number of quintuples of primes $p_1 < p_2 < p_3 < p_4 < p_5 \le N$ in



We refer the reader to [13] for further discussion. Several further applications of the
GI(s) conjectures will be given in a forthcoming paper of the first two authors [16].
Foundation. TZ is supported by ISF grant 557/08, an Alon fellowship, and a Landau
fellowship of the Taub foundation. All three authors are very grateful to the University
of Verona for allowing them to use classrooms at Canazei during a week in July 2009 
when this work was largely completed. 



2. An outline of the proof

In this section we outline the argument we use to establish the inverse conjecture for
the $U^4$-norm.

It is easy to show, and well-known, that if $\|f\|_{U^4} \ge \delta$ then there are $\gg \delta^C N$ values
of $h$ for which $\Delta_h f(n) := f(n+h)\overline{f(n)}$ has $U^3$-norm at least $\delta^C$. Applying GI(2), it
follows that for all these $h$ we have

\[ |\mathbb{E}_n \Delta_h f(n) \overline{\chi_h(n)}| \gg 1, \qquad (2.1) \]

where $\chi_h(n)$ is a 2-step nilsequence (with complexity bounded uniformly in $h$).

Very roughly speaking, the aim is to show that these 2-step nilsequences "line up"
in such a way that they may be interpreted as the derivatives of a single 3-step object.
To make this work and for ease of exposition it is convenient to assume that $\chi_h(n)$ is in
fact equal to $e(\psi_h(n))$, where $\psi_h(n)$ is a bracket quadratic phase: a sum of terms of the
form $\alpha_1 n \lfloor \alpha_2 n \rfloor$, $\alpha_3 n^2$ and $\alpha_4 n$. The link between these objects and 2-step nilsequences
was explored in [11, Section 10] and will be recalled later in this paper. For the
purposes of this discussion let us suppose that $\psi_h(n) = \alpha_h n \lfloor \beta_h n \rfloor$; this is something of a
simplification of the true situation.

Here is a rough outline of the main steps we shall be taking to control the dependence
of $\alpha_h$ and $\beta_h$ on $h$. Suppose that (2.1) holds with $\chi_h(n) = e(\alpha_h n \lfloor \beta_h n \rfloor)$.

Step 1 (Reducing the $h$-dependence) We may assume (possibly after refining the set of
$h$ and modifying $\alpha_h$ and $\beta_h$ somewhat) that $\beta_h$ does not depend on $h$.
Step 2 (Approximate linearity of $h$-dependent frequency) We may assume (possibly
after refining the set of $h$ again) that $\alpha_h$ is approximately equal to a bracket
linear form $\theta_1\{\theta'_1 h\} + \dots + \theta_d\{\theta'_d h\} + \theta h$.
Step 3 (Symmetry argument) Following Step 2, $\chi_h(n)$ is essentially $e(\psi_h(n))$ with the
phase $\psi_h(n)$ being of the form $T(h, n, n)$, where $T(n_1, n_2, n_3)$ is a sum of terms
of the form $\{\theta_1 n_1\} \theta_2 n_2 \lfloor \theta_3 n_3 \rfloor$. Not every such function $T(h, n, n)$ can be
obtained as the "derivative" of a 3-step object, however, and in order to make
this assertion we need some additional symmetry properties of the "generalised
trilinear form" $T(n_1, n_2, n_3)$.
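
To see in the simplest possible model what it means for the derivatives to "line up", one can take the genuinely polynomial phase $f(n) = e(\alpha n^3)$, which is a hypothetical toy case rather than the bracket situation treated in the paper. Then $\Delta_h f(n) = e(3\alpha h n^2 + 3\alpha h^2 n + \alpha h^3)$, a quadratic phase whose top-order coefficient $3\alpha h$ is exactly linear in $h$. The sketch below, with illustrative function names, checks this identity in exact rational arithmetic.

```python
from fractions import Fraction

def cubic_phase(alpha, n):
    """Phase of the toy function f(n) = e(alpha * n^3)."""
    return alpha * n ** 3

def derivative_phase(alpha, h, n):
    """Phase of Delta_h f(n) = f(n + h) * conj(f(n))."""
    return cubic_phase(alpha, n + h) - cubic_phase(alpha, n)

def quadratic_model(alpha, h, n):
    """Quadratic phase in n whose coefficients are polynomial in h."""
    return 3 * alpha * h * n ** 2 + 3 * alpha * h ** 2 * n + alpha * h ** 3
```

In this toy case the role of $\chi_h$ is played by $e(3\alpha h n^2 + 3\alpha h^2 n)$, and Steps 1 to 3 are all trivial; the difficulty in the paper is to recover comparable structure when only the correlation (2.1) is known.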
It may be of some interest to make a comparison between this strategy and that used
in the proof of the $U^3$-inverse theorem [11]. If $\|f\|_{U^3} \ge \delta$ then for many $h$ we have, once
again,

\[ |\mathbb{E}_n \Delta_h f(n) \overline{\chi_h(n)}| \gg 1, \]

but now $\chi_h(n)$ may be assumed to be nothing more complicated than a linear phase
$e(\alpha_h n)$. The argument runs roughly as follows:

Step 2' (Approximate linearity of frequencies) At the possible expense of passing to a
subset of the $h$, the frequencies $\alpha_h$ are approximately "bracket-linear" in $h$, as
above;

Step 3' (Symmetry argument) Following Step 2', $\chi_h(n)$ is essentially $e(\psi_h(n))$ with the
phase $\psi_h(n)$ being of the form $T(h, n)$ where $T(n_1, n_2)$ is a sum of terms of
the form $\{\theta_1 n_1\} \theta_2 n_2$. Not every such function $T(h, n)$ can be obtained as the
"derivative" of a 2-step object, however, and in order to make this assertion we
need some additional symmetry properties of the form $T(n_1, n_2)$.

We note that Step 2' is essentially due to Gowers [8, Chapter 7], although one must
apply a little extra geometry of numbers to get the precise conclusion we hint at here.
Step 3' is due to the first two authors and is the main new result of [11], specifically
Lemma 9.4 of that paper. Note that Step 1 in the outline above did not feature at all
in the proof of the $U^3$-inverse theorem and it is new to this paper.

Let us say a few words about how Steps 1, 2 and 3 are accomplished. The key to
almost all of our analysis is a straightforward adaption of a fundamental idea of Gowers
[7], which proceeds from the assumption that

\[ |\mathbb{E}_n \Delta_h f(n) \overline{\chi_h(n)}| \gg 1 \qquad (2.2) \]

for many $h$ and draws a conclusion involving just the $\chi_h(n)$, and not the function $f$.
This argument is valid for any bounded functions $\chi_h(n)$ and we give it in §6.
The conclusion of that argument is that

\[ |\mathbb{E}_{n \in [N]} \chi_{h_1}(n) \chi_{h_2}(n + h_1 - h_4) \overline{\chi_{h_3}(n)} \overline{\chi_{h_4}(n + h_1 - h_4)}| \gg 1 \qquad (2.3) \]

for many additive quadruples $h_1, h_2, h_3, h_4$, that is to say quadruples satisfying $h_1 + h_2 =
h_3 + h_4$.
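
A quick sanity check of the shape of this conclusion is available in the toy case $f(n) = e(\alpha n^3)$, where $\chi_h(n) = e(\psi_h(n))$ with $\psi_h(n) = 3\alpha h n^2 + 3\alpha h^2 n + \alpha h^3$. The sketch below takes the $h_3$ and $h_4$ factors with complex conjugates (an assumption of this sketch, chosen to match the constraint $h_1 + h_2 = h_3 + h_4$) and verifies in exact arithmetic that the combined phase is then constant in $n$, so that the average has modulus 1. All names are illustrative.

```python
from fractions import Fraction

def psi(alpha, h, n):
    """Phase of Delta_h f(n) for the toy function f(n) = e(alpha * n^3)."""
    return 3 * alpha * h * n ** 2 + 3 * alpha * h ** 2 * n + alpha * h ** 3

def quadruple_phase(alpha, h1, h2, h3, h4, n):
    """Phase of chi_{h1}(n) chi_{h2}(n+s) conj(chi_{h3}(n)) conj(chi_{h4}(n+s)), s = h1 - h4."""
    shift = h1 - h4
    return (psi(alpha, h1, n) + psi(alpha, h2, n + shift)
            - psi(alpha, h3, n) - psi(alpha, h4, n + shift))

def is_constant_in_n(alpha, h1, h2, h3, h4, n_range=range(-10, 11)):
    """True if the combined phase takes a single value over n_range."""
    values = {quadruple_phase(alpha, h1, h2, h3, h4, n) for n in n_range}
    return len(values) == 1
```

For a quadruple that is not additive the $n^2$ coefficient $3\alpha(h_1 + h_2 - h_3 - h_4)$ survives, and the phase genuinely oscillates.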

Steps 1, 2 and 3 all involve interpreting this in the case that $\chi_h(n)$ is a 2-step object
such as a bracket quadratic phase. One way to do this is to visualise

\[ \chi_{h_1}(n) \chi_{h_2}(n + h_1 - h_4) \overline{\chi_{h_3}(n)} \overline{\chi_{h_4}(n + h_1 - h_4)} \]

as a certain $h_1, h_2, h_3, h_4$-dependent nilsequence on a product of four nilmanifolds (one
for each of the $h_i$), in which case (2.3) states that the underlying polynomial sequence
is far from equidistributed. This situation may then be studied using
the distributional results on nilsequences contained in [14] in order to draw conclusions
concerning the dependence on $h$ of "leading order" terms in the $\chi_h(n)$.

Steps 1 and 2 really only use the "top-order" structure of (2.3), that is to say the
shifts $h_1 - h_4$ are not relevant. To handle Step 3 these shifts cannot be ignored. In the
general case the treatment of Step 3 will involve another appeal to the distributional
results on nilmanifolds in [14], but in the case of the $U^4$-norm a much more hands-on
approach involving Bohr sets may be employed, and it is this argument that we give
here.
The following deliberately vague discussion may perhaps be helpful. Suppose that
$|f(n)| = 1$ for all $n$ and that $\Delta_h f \sim \chi_h$ (where we are not attaching any real meaning
to $\sim$). Then we have the "cocycle identity" $\Delta_{h+k} f(n) = \Delta_h f(n+k) \Delta_k f(n)$, which
translates to $\chi_{h+k}(n) \sim \chi_h(n+k) \chi_k(n)$. Imagining that the shift $n \mapsto n + k$ does
not affect the "top-order structure" of $\chi_h(n+k)$, we have the approximate linearity
condition

\[ \chi_{h+k} \sim \chi_h \chi_k \quad \text{to top order.} \]

Roughly speaking, Steps 1 and 2 are concerned with exploiting this rigorously. On the
other hand we also have the symmetry relation $\Delta_h \Delta_k f = \Delta_k \Delta_h f$, which suggests that
$\Delta_h \chi_k \sim \Delta_k \chi_h$; Step 3 may be thought of in terms of exploiting this kind of information.
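
Both identities in this discussion are exact and easy to confirm numerically. The following sketch, assuming NumPy and working on a cyclic group for convenience (function names are illustrative), measures the failure of the cocycle identity and of the symmetry relation; the first vanishes precisely because $|f| = 1$, while the second holds for every $f$.

```python
import numpy as np

def delta(f, h):
    """Multiplicative derivative Delta_h f(n) = f(n + h) * conj(f(n)) on Z/len(f)Z."""
    return np.roll(f, -h) * np.conj(f)

def cocycle_defect(f, h, k):
    """max_n |Delta_{h+k} f(n) - Delta_h f(n + k) * Delta_k f(n)|."""
    lhs = delta(f, h + k)
    rhs = np.roll(delta(f, h), -k) * delta(f, k)
    return float(np.max(np.abs(lhs - rhs)))

def symmetry_defect(f, h, k):
    """max_n |Delta_k Delta_h f(n) - Delta_h Delta_k f(n)|."""
    return float(np.max(np.abs(delta(delta(f, h), k) - delta(delta(f, k), h))))
```

If $f$ is rescaled so that $|f| \ne 1$, the cocycle defect becomes visibly non-zero, which is why the unimodularity assumption appears in the text.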

3. Almost nilsequences 

In this paper we will be dealing with various objects which are "almost" nilsequences 
but not quite. They can invariably be represented as F(g(n)T) for some function F 
which is only piecewise Lipschitz, the discontinuities being on sets which are somehow 
"polynomial". Rather than formalise these notions, we instead introduce the notion 
of an approximate nilsequence, give some examples, and point out a number of conse- 
quences of the definition. 

Definition 3.1 (Almost nilsequences). Suppose that $\Psi : [N] \to \mathbb{C}$ is a 1-bounded
function and that $M > 1$ is a complexity parameter. Then we say that $\Psi$ is a degree $s$
almost polynomial nilsequence of complexity $O_M(1)$ if, for any $\varepsilon > 0$, there is a genuine
degree $s$ polynomial nilsequence $\Psi_\varepsilon$ with complexity $O_{\varepsilon,M}(1)$ such that
$\mathbb{E}_{n \in [N]} |\Psi(n) - \Psi_\varepsilon(n)| \le \varepsilon$.

Remarks. That is, $\Psi$ can be approximated arbitrarily well, in $L^1$, by genuine
nilsequences. We will not specify the function $O_{\varepsilon,M}(1)$ exactly (and indeed it does not
make sense to do so, in view of the loose manner in which we have defined complexity).
The reader should just imagine that there is some fixed function which may be taken
in this definition and which makes all statements that we make later on true. Let us
also remark that the non-standard analogue of this definition, which will feature in our
forthcoming paper on the general case GI(s), is much cleaner and does not involve any
unspecified complexity parameters $O_{\varepsilon,M}(1)$.

We make the following easily verified, but rather useful, claim: 

Lemma 3.2 (Algebra properties). If $\Phi, \Psi$ are degree $s$ almost polynomial nilsequences,
then their sum $\Phi + \Psi$ and product $\Phi\Psi$, and complex conjugate $\overline{\Phi}$ are also degree $s$
almost polynomial nilsequences (with a slightly different complexity bound $O_{\varepsilon,M}(1)$ on
the approximants, of course).

The utility of Definition 3.1 is made clear by the following lemma, which states that
correlation with almost nilsequences is essentially the same thing as correlation with
genuine nilsequences.

Lemma 3.3. Suppose that $f : [N] \to \mathbb{C}$ is a 1-bounded function and that

\[ |\mathbb{E}_{n \in [N]} f(n) \Psi(n)| \ge \delta \]

for some degree $s$ almost polynomial nilsequence $\Psi$ of complexity $O_\delta(1)$. Then there
is a genuine degree $s$ polynomial nilsequence $F(g(n)\Gamma)$ of complexity $O_{s,\delta}(1)$ such that
$|\mathbb{E}_{n \in [N]} f(n) F(g(n)\Gamma)| \ge \delta/2$.






BEN GREEN, TERENCE TAO, AND TAMAR ZIEGLER 



Proof. Simply take $\varepsilon = \delta/2$ in Definition 3.1 and set $F(g(n)\Gamma) = \Psi_\varepsilon(n)$. □



A particular consequence, which we shall make use of later, is that it suffices to
establish Conjecture 1.3 with almost nilsequences instead of genuine ones.
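
The mechanism behind Lemma 3.3 is just the triangle inequality: since $f$ is 1-bounded, replacing $\Psi$ by an approximant at $L^1$ distance at most $\varepsilon$ changes the correlation by at most $\varepsilon$. The sketch below, assuming NumPy and using arbitrary arrays in place of actual nilsequences (the names are illustrative), makes this bookkeeping explicit.

```python
import numpy as np

def correlation(f, psi):
    """|E_n f(n) psi(n)| over the common index range."""
    return abs(np.mean(f * psi))

def l1_distance(psi, approx):
    """E_n |psi(n) - approx(n)|."""
    return float(np.mean(np.abs(psi - approx)))

def transfer_bound(f, psi, approx):
    """Lower bound for |E f approx| from the triangle inequality, valid when |f| <= 1."""
    return correlation(f, psi) - l1_distance(psi, approx)
```

Taking the $L^1$ distance to be at most $\delta/2$ and the original correlation to be at least $\delta$ gives the bound $\delta/2$ claimed in the lemma.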

For 1-step nilsequences there is a further, very helpful, reduction that can be made. 

Lemma 3.4 (1-step correlation). Suppose that $f : [N] \to \mathbb{C}$ is a 1-bounded function and
that $|\mathbb{E}_{n \in [N]} f(n) \Psi(n)| \ge \delta$ for some degree 1 almost nilsequence $\Psi$ of complexity $O_\delta(1)$.
Then there is a $\theta \in \mathbb{R}/\mathbb{Z}$ such that $|\mathbb{E}_{n \in [N]} f(n) e(\theta n)| \gg_\delta 1$. (The implied constants here
depend of course on the implied constants in the definition of an almost nilsequence.)

Proof. By the previous lemma we may assume that $\Psi$ is a genuine degree 1 nilsequence
of complexity $O_\delta(1)$, that is to say a sequence of the form $(F(n\alpha))_{n \in \mathbb{Z}}$ where $\alpha \in (\mathbb{R}/\mathbb{Z})^k$
for some $k = O_\delta(1)$ and $F : (\mathbb{R}/\mathbb{Z})^k \to \mathbb{C}$ is a function with Lipschitz constant $O_\delta(1)$.
Standard Fourier analysis (see, for example, [12, Lemma A.9]) implies that we may
expand

\[ F(t) = \sum_{m \in \mathbb{Z}^k : |m| = O_\delta(1)} c_m e(m \cdot t) + O(\delta/10), \]

where the $c_m$ are complex numbers with $|c_m| = O_\delta(1)$. The result follows quickly from
this. □

The next two lemmas collect together various examples of almost nilsequences. The
proofs, which are somewhat technical and tedious, are given in Appendix E.

Lemma 3.5. Suppose that $\alpha, \beta \in [0, 1]$ and that $M > 1$ is a complexity parameter. The
following are all examples of almost nilsequences of degree 1 and complexity $O_M(1)$:

(i) the set of 1-step Lipschitz nilsequences of complexity at most $M$;

(ii) the set of characteristic functions $1_P$, where $P \subseteq [N]$ is a progression of length
at least $N/M$;

(iii) the set of functions of the form $n \mapsto e(\alpha\{\beta n\})$, with $\alpha \in \mathbb{R}$ and $\beta \in \mathbb{R}/\mathbb{Z}$;

(iv) the set of functions of the form $n \mapsto e(\{\alpha n\}\{\beta n\})$, with $\alpha, \beta \in \mathbb{R}/\mathbb{Z}$;

(v) the set of functions of the form $n \mapsto e(\alpha n \lfloor \beta n \rfloor)$, where $\|\beta\|_{\mathbb{R}/\mathbb{Z}} \le M/N$.

In particular (by Lemma 3.4) if $f : [N] \to \mathbb{C}$ is a 1-bounded function such that
$|\mathbb{E}_{n \in [N]} f(n) \Psi(n)| \ge \delta$, where $\Psi$ is one of the functions on the above list, then there
exists $\theta \in \mathbb{R}/\mathbb{Z}$ such that $|\mathbb{E}_{n \in [N]} f(n) e(\theta n)| \gg_{\delta,M} 1$.

Lemma 3.6. Suppose that $\alpha, \beta, \gamma \in [0, 1]$. Then the following are all examples of almost
nilsequences of degree $s \ge 2$ and complexity $O(1)$:

(i) $n \mapsto e(\{\alpha n\} \beta n)$, of degree 2;

(ii) $n \mapsto e(\{\alpha n\} \beta n^2)$, of degree 3;

(iii) $n \mapsto e(\{\alpha n\}\{\beta n\} \gamma n)$, of degree 3.

Although the proof of this last lemma is a little tedious, it is also important in the
sense that this is the only place in our paper where a 3-step nilsequence is actually
constructed.

4. Distributional results concerning nilsequences

We will rely heavily on the quantitative distribution results concerning polynomial
nilsequences $(g(n)\Gamma)_{n \in \mathbb{Z}}$ established by the first two authors in [14]. There were two
main results in that paper, the first of which was used in the proof of the second. We are
aware that the paper [14] is long and somewhat difficult. However, the reader wishing
to understand the present paper need only be au fait with the statements of the results
there, which means that she need only read Chapters 1 and 2 of the paper. We will
assume familiarity with those chapters throughout this paper, and in particular will use
notation from them without further comment. We will also revisit these results in a
non-standard setting in the sequel to this paper, in which we will give more detailed
proofs.

The first result we refer to gives a criterion for $(g(n)\Gamma)_{n \in [N]}$ being equidistributed.
This is [14, Theorem 2.9]. This theorem is a quantitative version of a polynomial
equidistribution theorem for nilmanifolds. The qualitative version basically claims that
equidistribution of polynomial sequences is determined on the abelianization $G/[G, G]\Gamma$.
For linear sequences this is a classical result, and for polynomial sequences the result is
due to Leibman [21].

Theorem 4.1 (Quantitative Leibman dichotomy). [14, Theorem 2.9] Let $m, s \ge 0$,
$0 < \delta < 1/2$ and $N \ge 1$. Suppose that $G/\Gamma$ is an $m$-dimensional nilmanifold together
with a filtration $G_\bullet$ of degree $s$ and that $\mathcal{X}$ is a $\frac{1}{\delta}$-rational Mal'cev basis adapted to $G_\bullet$.
Suppose that $g : \mathbb{Z} \to G$ is a polynomial sequence adapted to $G_\bullet$. If $(g(n)\Gamma)_{n \in [N]}$ is not
$\delta$-equidistributed, then there is some $k \in \mathbb{Z}^{m'}$, where $m' := \dim G - \dim [G, G]$ is the
dimension of the horizontal torus of $G/\Gamma$, with $0 < |k| \ll \delta^{-O_{m,s}(1)}$ and

\[ \|k \cdot (\pi \circ g)\|_{C^\infty[N]} \ll \delta^{-O_{m,s}(1)}, \qquad (4.1) \]

where $\pi : G \to G/[G, G]\Gamma \cong (\mathbb{R}/\mathbb{Z})^{m'}$ is projection onto the horizontal torus of $G/\Gamma$.
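
The dichotomy is easiest to picture in the abelian model case $G = \mathbb{R}$, $\Gamma = \mathbb{Z}$, $g(n) = \alpha n$, where the horizontal torus is $\mathbb{R}/\mathbb{Z}$ itself and the smoothness norm of $n \mapsto k\alpha n$ on $[N]$ is $N \|k\alpha\|_{\mathbb{R}/\mathbb{Z}}$. The sketch below, assuming NumPy, looks for a small non-zero frequency $k$ obstructing equidistribution; the thresholds `K` and `1` are illustrative stand-ins for the quantities $\delta^{-O_{m,s}(1)}$ in the theorem, not values taken from [14].

```python
import numpy as np

def dist_to_nearest_integer(x):
    """The R/Z norm ||x|| of a real number x."""
    return abs(x - round(x))

def weyl_mean(alpha, N, k=1):
    """|E_{n in [N]} e(k * alpha * n)|, small when n*alpha mod 1 is well distributed against k."""
    n = np.arange(1, N + 1)
    return abs(np.mean(np.exp(2j * np.pi * k * alpha * n)))

def horizontal_obstruction(alpha, N, K):
    """Smallest k in 1..K with N * ||k * alpha|| <= 1, or None if there is none."""
    for k in range(1, K + 1):
        if N * dist_to_nearest_integer(k * alpha) <= 1:
            return k
    return None
```

For $\alpha$ very close to $1/3$ the frequency $k = 3$ is an obstruction and the corresponding exponential sum is large, whereas for $\alpha = \sqrt{2}$ and $N = 1000$ no $k \le 10$ qualifies and the sum with $k = 1$ is small.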

The second result we allude to, proved in sections 9 and 10 of [13] by iterating the
preceding theorem, is a certain factorization result. We will need a variant of it in
the present paper involving an arbitrary growth function $\omega : \mathbb{R}^+ \to \mathbb{R}^+$; this may be
established by exactly the same iterative argument that is used in the proof of [14,
Theorem 10.2].

Theorem 4.2 (Factorization result). Let $s, N \ge 0$ be integers, let $M \ge 1$ be a real
number, and let $\omega : \mathbb{R}^+ \to \mathbb{R}^+$ be an arbitrary growth function. Suppose that $G/\Gamma$
is a nilmanifold of complexity at most $M$ together with a filtration $G_\bullet$ of degree $s$.
Suppose that $\mathcal{X}$ is an $M$-rational Mal'cev basis adapted to $G_\bullet$ and that $g : \mathbb{Z} \to G$ is a
polynomial map adapted to $G_\bullet$. Then there is an integer $M_0$ with $M \le M_0 = O_{M,s,\omega}(1)$,
a rational subgroup $G' \subseteq G$, a Mal'cev basis $\mathcal{X}'$ for $G'/\Gamma'$ in which each element is
an $M_0$-rational combination of the elements of $\mathcal{X}$, and a decomposition $g = \varepsilon g' \gamma$ into
polynomial sequences $\varepsilon, g', \gamma : \mathbb{Z} \to G$ adapted to $G_\bullet$ with the following properties:

(i) $\varepsilon : \mathbb{Z} \to G$ is $(M_0, N)$-smooth;

(ii) $g' : \mathbb{Z} \to G'$ takes values in $G'$, and the finite sequence $(g'(n)\Gamma'')_{n \in [N]}$ is totally
$1/\omega(M_0)$-equidistributed in $G'/\Gamma''$, whenever $\Gamma''$ is a sublattice of $\Gamma'$ of index at
most $\omega(M_0)$, and using the metric $d_{\mathcal{X}}$ on $G'/\Gamma''$;

(iii) $\gamma : \mathbb{Z} \to G$ is $M_0$-rational, and $(\gamma(n)\Gamma)_{n \in \mathbb{Z}}$ is periodic with period at most $M_0$.



We feel rather sorry for our readers at this point. One particular advantage of the non-standard
analysis approach to be taken in the more general paper to follow is that the need for arbitrary growth
functions $\omega$ is eliminated.
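
A heavily simplified abelian toy case may help to fix the roles of the three factors. For $G = \mathbb{R}$, $\Gamma = \mathbb{Z}$ and $g(n) = \alpha n$ (written additively), one can sometimes split $\alpha = \varepsilon + a/q$ with $q \le M_0$ and $N|\varepsilon| \le M_0$, so that $n \mapsto \varepsilon n$ is slowly varying on $[N]$, $n \mapsto an/q$ is periodic modulo 1 with period $q$, and the equidistributed factor $g'$ is trivial. The sketch below is only this toy splitting under those assumptions; the function name and thresholds are hypothetical, and it does not implement the iterative argument of [14].

```python
from fractions import Fraction

def factor_linear_phase(alpha, N, M0):
    """Try to write alpha = smooth + rational with denominator <= M0 and N*|smooth| <= M0.

    Returns (smooth, rational) on success and None otherwise; in the latter case
    this toy sketch makes no claim about the sequence n * alpha.
    """
    rational = Fraction(alpha).limit_denominator(M0)
    smooth = alpha - float(rational)
    if N * abs(smooth) <= M0:
        return smooth, rational
    return None
```

When the function returns `None`, the heuristic expectation is that $n\alpha$ is already reasonably equidistributed at the given scale, which corresponds to taking $g' = g$ in the theorem.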
Remark. The terms "smooth" and "totally equidistributed" in this sort of context
will not feature elsewhere in the paper, as we shall rely only on this theorem to prove
Theorem 4.3 below. An extremely similar deduction was utilised (and proved in some
detail) in §2 of [15].

Sketch proof. The main idea is to iterate Theorem 4.1 using the following "dimension
reduction argument". At any given stage of the argument, one has an initial factorisation
$g = \varepsilon g' \gamma$ obeying all the properties claimed in the theorem for some $M_0$, except for the
equidistribution conclusions on $g'$. (Note that one can trivially obtain such an initial
factorisation by setting $\varepsilon$ and $\gamma$ to be the identity, and $G' = G$.) If $g'$ obeys the stated
equidistribution properties, then we are done. Otherwise, by appealing to Theorem 4.1
and refining to a finite sublattice of $\Gamma'$ if necessary, the horizontal coefficients of $g'(n)$
will contain an approximate linear dependence in the sense of (4.1). One can then use
this, following the arguments used to prove [14, Theorem 10.2], in order to factorise
$g' = \varepsilon' g'' \gamma'$, where $\varepsilon', \gamma'$ satisfy similar properties to $\varepsilon, \gamma$ but with a worse value of
$M_0$, and $g''$ takes values in a connected subgroup $G''$ of $G'$ of strictly lower dimension.
We then absorb the $\varepsilon'$ and $\gamma'$ factors into $\varepsilon, \gamma$, replace $G'$ by $G''$, increase $M_0$ to a larger
quantity depending on $M_0$ and $\omega$, and continue the argument. Since one cannot have an
infinite descent of connected subgroups of $G$, the argument must eventually terminate
with a factorisation with the desired properties. □

The theorem below is a quantitative version of an equidistribution result of Leibman
[21] stating that the orbit closure of a polynomial sequence $(g(n)\Gamma)_{n \in \mathbb{N}}$ is a finite union
of subnilmanifolds $Y_j$, each a closed orbit of a connected closed subgroup $H_j$ of $G$;
moreover the polynomial sequence visits each $Y_j$ periodically, and is well distributed
there with respect to the normalized Haar measure. Much the same argument (with
more details) is given in Section 2 of [15].

Theorem 4.3 ("Quantitative Ratner" result). Let $s, N \ge 0$ be integers, let $M \ge 1$ be a
real number, and let $\omega : \mathbb{R}^+ \to \mathbb{R}^+$ be an arbitrary growth function. Suppose that $G/\Gamma$ is
a nilmanifold of complexity at most $M$ together with a filtration $G_\bullet$ of degree $s$. Suppose
that $\mathcal{X}$ is an $M$-rational Mal'cev basis adapted to $G_\bullet$ and that $g : \mathbb{Z} \to G$ is a polynomial
map adapted to $G_\bullet$. Then there is an integer $M_0$ with $M \le M_0 = O_{M,s,\omega}(1)$ and a
decomposition of $[N]$ into subprogressions $P_j$, each of length at least $N/M_0$, together
with $M_0$-rational connected subgroups $H_j \le G$ and elements $x_j \in G$ with coordinates at
most $M_0$ such that $(g(n)\Gamma)_{n \in P_j}$ is $1/\omega(M_0)$-equidistributed on $x_j H_j \Gamma/\Gamma$ for each $j$.

Sketch proof. In Theorem 4.2, take a growth function $\omega' : \mathbb{R}^+ \to \mathbb{R}^+$ even more
rapidly growing than the $\omega$ in the statement here. Let $g = \varepsilon g' \gamma$ be the resulting
decomposition. Take the progressions $P_j$ to have common difference $q$, the period of
$\gamma(n)\Gamma$, and length sufficiently small that the smooth term $\varepsilon(n)$ is almost constant on
each $P_j$. Choose $y_j, \gamma_j$ such that $\varepsilon(n) \approx y_j$ and $\gamma(n)\Gamma = \gamma_j\Gamma$ for $n \in P_j$. Then the
theorem holds with $H_j := \gamma_j^{-1} G' \gamma_j$ and $x_j := y_j \gamma_j$. Note that the action of conjugation
by $\gamma_j$ moves $\Gamma$ to a slightly different subgroup of $G$, but this new group intersects $\Gamma$ in a
subgroup of index $O_{M_0}(1)$, and so one can proceed by using the fact that $g'$ is assumed
equidistributed with respect to such subgroups also. □
5. Free nilpotent Lie groups and free nilcharacters

In previous papers in additive combinatorics in which nilsequences have been dis- 
cussed, such as [HI [H], the Heisenberg nilmanifold has been the central example and 
readers have been encouraged to think of upper triangular matrix groups as the arche- 
typal nilpotent Lie groups. A key innovation in this paper and the sequel [17], strongly 
inspired by the recent work of Leibman on bracket polynomials [22], is a shift away from 
this viewpoint. Instead, it seems that free nilpotent Lie groups and certain functions 
on them play a crucial role. 

In this section we give some basic definitions in this regard in the 2-step case. In 
Appendix E we will briefly meet an example of the 3-step case, but for the most part
we will be working with 2-step objects in which case it is not a particularly onerous 
task to proceed very explicitly. The definitions in the higher step case are similar but 
necessarily require some more general discussion of bases in free nilpotent Lie algebras. 

Definition 5.1 (Free 2-step nilpotent Lie group and nilmanifold). By the free 2-step
nilpotent Lie group on generators $e_1, \dots, e_k$ we mean

$$G := \{ e_1^{t_1} \cdots e_k^{t_k} e_{[2,1]}^{t_{[2,1]}} \cdots e_{[k,k-1]}^{t_{[k,k-1]}} : t_1, \dots, t_k, t_{[2,1]}, \dots, t_{[k,k-1]} \in \mathbb{R} \},$$

subject to the relations $e_i^{-1} e_j^{-1} e_i e_j = [e_i, e_j] = e_{[i,j]}$ for $1 \le j < i \le k$. By the standard
filtration $G_\bullet$ we mean simply the lower central series filtration with $G_{(0)} = G_{(1)} = G$,
$G_{(2)} = [G,G]$ and $G_{(3)} = \{\mathrm{id}\}$. Inside $G$ we take the standard lattice

$$\Gamma := \{ e_1^{m_1} \cdots e_k^{m_k} e_{[2,1]}^{m_{[2,1]}} \cdots e_{[k,k-1]}^{m_{[k,k-1]}} : m_1, \dots, m_k, m_{[2,1]}, \dots, m_{[k,k-1]} \in \mathbb{Z} \}.$$

The quotient $G/\Gamma$ is then called the free 2-step nilmanifold on $k$ generators.

A Mal'cev basis for $G/\Gamma$ consists of the elements $X_i = \log e_i$ and $X_{[i',i]} = \log e_{[i',i]}$;
the Mal'cev coordinates of an element of $G$ are simply the elements

$$(t_1, \dots, t_k, t_{[2,1]}, \dots, t_{[k,k-1]}).$$

As in [13], such a basis may be used to coordinatise $G/\Gamma$ by identifying $[0,1]^{k+\binom{k}{2}}$ as a
fundamental domain for the right action of $\Gamma$ on $G$. Let us perform a calculation. In
Mal'cev coordinates it is easy to check that the multiplication law on $G$ corresponds to
the operation

$$(t_i, t_{[i',i]}) * (u_i, u_{[i',i]}) = (t_i + u_i,\; t_{[i',i]} + u_{[i',i]} + t_{i'} u_i).$$

For a given element $g \in G$ with coordinates $(t_i, t_{[i',i]})$ we may pick some $\gamma \in \Gamma$ such
that $g\gamma$ has coordinates in the fundamental domain $\mathcal{F} = [0,1]^{k+\binom{k}{2}} \subseteq \mathbb{R}^{k+\binom{k}{2}}$. Possible
coordinates for $\gamma$ are

$$u_i = -\lfloor t_i \rfloor, \qquad u_{[i',i]} = -\lfloor t_{[i',i]} - t_{i'} \lfloor t_i \rfloor \rfloor,$$

where $\lfloor\ \rfloor$ is the floor function. These coordinates are unique if $g\gamma$ lies in the interior of
the fundamental domain $\mathcal{F}$. The coordinates of $g\gamma$ are then

$$(\{t_i\}, \{t_{[i',i]} - t_{i'} \lfloor t_i \rfloor\}).$$

Definition 5.2 (Coordinates). Suppose that $G/\Gamma$ is the free 2-step nilmanifold on $k$
generators. Suppose that an element $g \in G$ has Mal'cev coordinates $t_i, t_{[i',i]}$. Then the
coordinates of $g\Gamma \in G/\Gamma$ are the entries of the vector

$$(\{t_i\}, \{t_{[i',i]} - t_{i'} \lfloor t_i \rfloor\}).$$

We write them as (U,^^. 
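The calculation above can be checked mechanically. The following is a minimal Python sketch, not part of the original argument: the function names (`multiply`, `lattice_shift`, `reduced_coordinates`) and the 0-based indexing of the generators are illustrative assumptions. It encodes the multiplication law in Mal'cev coordinates with exact rational arithmetic and confirms, on one example, that $g\gamma$ has the coordinates $(\{t_i\}, \{t_{[i',i]} - t_{i'}\lfloor t_i\rfloor\})$.

```python
from fractions import Fraction
from math import floor

# An element of G is stored as (lin, top): lin is the tuple (t_0, ..., t_{k-1})
# and top maps each pair (i', i) with i' > i to the coordinate t_[i', i].
# Indices are 0-based here, which is an assumption of this sketch.

def multiply(t, u):
    """Group law: (t_i + u_i, t_[i',i] + u_[i',i] + t_i' * u_i)."""
    (t_lin, t_top), (u_lin, u_top) = t, u
    lin = tuple(a + b for a, b in zip(t_lin, u_lin))
    top = {(ip, i): t_top[ip, i] + u_top[ip, i] + t_lin[ip] * u_lin[i]
           for (ip, i) in t_top}
    return lin, top

def lattice_shift(t):
    """Lattice element gamma with u_i = -floor(t_i) and
    u_[i',i] = -floor(t_[i',i] - t_i' * floor(t_i))."""
    t_lin, t_top = t
    lin = tuple(-floor(a) for a in t_lin)
    top = {(ip, i): -floor(t_top[ip, i] - t_lin[ip] * floor(t_lin[i]))
           for (ip, i) in t_top}
    return lin, top

def frac(x):
    """Fractional part {x} = x - floor(x)."""
    return x - floor(x)

def reduced_coordinates(t):
    """Coordinates ({t_i}, {t_[i',i] - t_i' * floor(t_i)}) of the coset of t."""
    t_lin, t_top = t
    lin = tuple(frac(a) for a in t_lin)
    top = {(ip, i): frac(t_top[ip, i] - t_lin[ip] * floor(t_lin[i]))
           for (ip, i) in t_top}
    return lin, top

# Example with k = 2 generators and a single commutator coordinate t_[1,0].
g = ((Fraction(7, 3), Fraction(-5, 4)), {(1, 0): Fraction(22, 7)})
g_gamma = multiply(g, lattice_shift(g))
print(g_gamma == reduced_coordinates(g))
```

Exact fractions are used instead of floats so that the floor function never sees a rounding error at an integer boundary.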
Definition 5.3 (Coordinate functions). By the basic coordinate functions $F_i, F_{[i',i]} :
G/\Gamma \to \mathbb{C}$ we mean the functions $F_i(t) = e(t_i)$ and $F_{[i',i]}(t) = e(t_{[i',i]})$, where $t$ denotes the
coordinates of Definition 5.2. The top order basic coordinate functions $F_{[i',i]}$ will have a
particularly important role to play.

Consider now a polynomial sequence $g : \mathbb{Z} \to G$ of the form

$$g(n) := (\xi_1 n, \dots, \xi_k n, q_{[2,1]}(n), \dots, q_{[k,k-1]}(n)), \tag{5.1}$$

where the $q_{[i',i]}$ are quadratic polynomials. By the theory developed towards the end
of §6 of [13] (or simply by a short direct calculation), these are degree two polynomial
sequences adapted to the standard filtration $G_\bullet$ based on the lower central series. The
objects $F_i(g(n)\Gamma)$, $F_{[i',i]}(g(n)\Gamma)$ are then called free 2-step nilcharacters, and they will
be basic building blocks in this paper. The top-order nilcharacters involving $F_{[i',i]}$ will
play a particularly crucial role. In the light of the above computations these top-order
free 2-step nilcharacters may be computed quite explicitly, and indeed we have

$$F_{[i',i]}(g(n)\Gamma) = e(-\xi_{i'} n \lfloor \xi_i n \rfloor)\, e(\alpha_{[i',i]} n^2 + \beta_{[i',i]} n) \tag{5.2}$$

for some $\alpha_{[i',i]}, \beta_{[i',i]} \in \mathbb{R}$. By altering the quadratics $q_{[i',i]}(n)$ we may make the
coefficients $\alpha_{[i',i]}, \beta_{[i',i]}$ arbitrary. These quadratic phases $e(\alpha n^2 + \beta n)$ should be thought
of as essentially 1-step objects, albeit of degree 2, and the most important feature of
our 2-step nilcharacters is the bracket monomials $\xi_{i'} n \lfloor \xi_i n \rfloor$. We will often use explicit
bracket-quadratics in this paper. In the longer paper to come, dealing with the general
case, it will not be possible to proceed so explicitly and indeed the main innovation
of that paper (following the work of Leibman) is to develop a kind of "calculus" of
bracket polynomials.

Let us note that $F_{[i',i]}(g(n)\Gamma)$ is not actually a 2-step nilsequence, because the function
$F_{[i',i]}$ is only piecewise Lipschitz. From the explicit form given above and Lemma 3.6,
however, one sees that it is an almost 2-step nilsequence.

We now give a variant of the $U^3$ inverse theorem involving 2-step free nilcharacters.

Theorem 5.4 (Inverse theorem for $U^3$, variant). Suppose that $f : [N] \to \mathbb{C}$ is a
1-bounded function with $\|f\|_{U^3} \ge \delta$. Then we have $|\mathbb{E}_{n \in [N]} f(n)\chi(n)| \gg_\delta 1$, where

$$\chi(n) = e(\alpha n^2 + \beta n) \prod_{1 \le i < i' \le k} F_{[i',i]}^{m_{[i',i]}}(g(n)\Gamma)$$

is the product of some free 2-step nilcharacters with a quadratic phase. Here, $k = O_\delta(1)$
and the $m_{[i',i]}$ are integers bounded by $O_\delta(1)$.

Proof. In [11, Theorem 10.9] it is shown that a function $f$ with $\|f\|_{U^3} \ge \delta$ has
inner product $\gg_\delta 1$ with a function which is the product of $O_\delta(1)$ bracket quadratics
$e(\alpha_i n \lfloor \beta_i n \rfloor)$, $i = 1, \dots, m$, and a quadratic phase $e(\alpha n^2 + \beta n)$. But such a function
already has the form $\chi$ given in the statement of the theorem, simply by taking $k = 2m$
and horizontal frequencies $\xi_{2i-1} = \beta_i$ and $\xi_{2i} = \alpha_i$, $i = 1, 2, \dots, m$. □

Remark. The proof of [11, Theorem 10.9] was actually a stepping stone on the way
to the proof of the $U^3$ inverse theorem itself, which requires these 2-step nilcharacters
to be assembled into a Lipschitz Heisenberg nilsequence.

Remark. It is possible to proceed directly from the $U^3$ inverse theorem, that is to
say from the formulation given in Conjecture 1.3, although - as the previous remark
suggests - it would be a little perverse to do so. To do this requires one to do a slightly
odd kind of Fourier decomposition in the coordinate space $[0,1]^{k+\binom{k}{2}}$, mapped
onto the torus $(\mathbb{R}/\mathbb{Z})^{k+\binom{k}{2}}$, but there is an issue because a function $F$ which is Lipschitz
on $G/\Gamma$ need not even be continuous on this torus. We have a way around this difficulty
involving the introduction of a random shift to the fundamental domain $\mathcal{F}$. However
we do not believe this argument will be necessary even in the more general paper to
come, since our plan is to first prove a variant form of Conjecture 1.3 akin to Theorem
5.4 by induction and only then to deduce Conjecture 1.3 itself.

To conclude this section we give some crucial identities involving bracket quadratics. 
It is the proper understanding and generalisation of these that we referred to above 
when we talked about the development of a "calculus" of bracket polynomials in the 
forthcoming longer paper. 

The key identity we shall rely on is 

$$X\lfloor Y\rfloor = XY - \{X\}\{Y\} - \lfloor X\rfloor Y + \lfloor X\rfloor\lfloor Y\rfloor, \tag{5.3}$$

valid for all $X, Y \in \mathbb{R}$. This implies that the map $\phi : (X,Y) \mapsto X\lfloor Y\rfloor \pmod 1$ is
"antisymmetric and bilinear modulo lower order terms". Specifically, $\phi(X_1 + X_2, Y) -
\phi(X_1, Y) - \phi(X_2, Y) = 0$, whilst $\phi(X, Y_1 + Y_2) - \phi(X, Y_1) - \phi(X, Y_2) = \{X\}\{Y_1\} +
\{X\}\{Y_2\} - \{X\}\{Y_1 + Y_2\}$, and $\phi(X,Y) = -\phi(Y,X) + XY - \{X\}\{Y\}$.
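These identities are easy to test with exact arithmetic. The following Python sketch is an illustration only (the helper names `frac`, `phi` and the three `*_ok` checks are not from the paper); it verifies (5.3) and the three stated consequences modulo 1 on random rational inputs.

```python
from fractions import Fraction
from math import floor
import random

def frac(x):
    """Fractional part {x} = x - floor(x)."""
    return x - floor(x)

def identity_ok(X, Y):
    """Identity (5.3): X*floor(Y) = XY - {X}{Y} - floor(X)*Y + floor(X)*floor(Y)."""
    return X * floor(Y) == X * Y - frac(X) * frac(Y) - floor(X) * Y + floor(X) * floor(Y)

def phi(X, Y):
    """The map (X, Y) -> X*floor(Y) mod 1, represented in [0, 1)."""
    return frac(X * floor(Y))

def first_variable_ok(X1, X2, Y):
    """phi is additive in the first variable modulo 1."""
    return frac(phi(X1 + X2, Y) - phi(X1, Y) - phi(X2, Y)) == 0

def second_variable_ok(X, Y1, Y2):
    """Defect in the second variable is {X}{Y1} + {X}{Y2} - {X}{Y1 + Y2} mod 1."""
    defect = frac(X) * (frac(Y1) + frac(Y2) - frac(Y1 + Y2))
    return frac(phi(X, Y1 + Y2) - phi(X, Y1) - phi(X, Y2)) == frac(defect)

def antisymmetry_ok(X, Y):
    """phi(X, Y) = -phi(Y, X) + XY - {X}{Y} modulo 1."""
    return frac(phi(X, Y) + phi(Y, X)) == frac(X * Y - frac(X) * frac(Y))

rng = random.Random(0)

def random_rational():
    return Fraction(rng.randint(-50, 50), rng.randint(1, 12))

all_ok = True
for _ in range(200):
    X, Y, Z = random_rational(), random_rational(), random_rational()
    all_ok = all_ok and identity_ok(X, Y) and first_variable_ok(X, Z, Y)
    all_ok = all_ok and second_variable_ok(X, Y, Z) and antisymmetry_ok(X, Y)
print(all_ok)
```

A random test of this kind is only a sanity check, not a proof; the identity itself follows by writing $X = \lfloor X\rfloor + \{X\}$ and $Y = \lfloor Y\rfloor + \{Y\}$ and expanding.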

Let us clarify to some extent what we mean by "lower order". We shall be
applying these identities when $X_i = \alpha_i n$ and $Y_j = \beta_j n$, and we shall also be considering
$e(\phi)$ rather than $\phi$ itself. Then these obstructions to antisymmetric bilinearity take
the form $e(\{\alpha n\}\{\beta n\})$, an almost 1-step nilsequence (cf. Lemma 3.5 (iv)), and $e(\theta n^2)$,
another 1-step object (but of degree 2).

Let us record these observations in the form of a lemma. 

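Lemma 5.5 (Bracket quadratic identities). Suppose that $\alpha, \alpha_1, \alpha_2, \beta, \beta_1, \beta_2, \gamma \in \mathbb{R}$.
Then

(i) $e((\alpha_1 + \alpha_2) n \lfloor \beta n \rfloor) = e(\alpha_1 n \lfloor \beta n \rfloor)\, e(\alpha_2 n \lfloor \beta n \rfloor)$;

(ii) $e(\alpha n \lfloor (\beta_1 + \beta_2) n \rfloor) = e(\alpha n \lfloor \beta_1 n \rfloor)\, e(\alpha n \lfloor \beta_2 n \rfloor)$ up to a product of terms of the
form $e(\{\theta n\}\{\theta' n\})$;

(iii) $e(\alpha n \lfloor \beta n \rfloor) = e(-\beta n \lfloor \alpha n \rfloor)$ up to a product of terms of the form $e(\theta n^2)$ and
$e(\{\theta' n\}\{\theta'' n\})$;

(iv) $e(\gamma n \lfloor \gamma n \rfloor)$ is a product of terms of the form $e(\theta n^2)$ and $e(\{\theta' n\}\{\theta'' n\})$.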

Proof. The first three of these follow immediately from (5.3) and the subsequent
discussion. Part (iv) perhaps requires some comment: to prove it, first choose $\gamma'$ so that
$2\gamma' = \gamma \pmod 1$. Then take $\alpha = \beta = \gamma'$ in (iii) to obtain the fact that $e(\gamma n \lfloor \gamma' n \rfloor)$ is a
product of terms of the required type. Now apply (ii) to conclude the same thing for
$e(\gamma n \lfloor \gamma n \rfloor)$. □

6. Some arguments of Gowers 

In this section we give the observation of Gowers [7] described in §2, whereby one
proceeds from the assumption that $|\mathbb{E}_n \Delta_h f(n)\chi_h(n)| \ge \delta$ for many $h$ to get (2.3), a
kind of weak linearity statement concerning the map $h \mapsto \chi_h$. Here is a more precise
statement.
Proposition 6.1 (Gowers). Suppose that $f : [N] \to \mathbb{C}$ is a 1-bounded function, that
$H \subseteq [N]$ is a set with cardinality $\eta N$ and that for each $h \in H$ we have a function
$\chi_h : [N] \to \mathbb{C}$ with $|\chi_h(n)| \le 1$ for all $n$, and that

$$|\mathbb{E}_{n \in [N]} \Delta_h f(n)\chi_h(n)| \ge \delta \tag{6.1}$$

for all $h \in H$. Then for at least $\eta^8 \delta^4 N^3/2$ of the quadruples $(h_1, h_2, h_3, h_4)$ such that
$h_1 + h_2 = h_3 + h_4$ we have

$$|\mathbb{E}_n \chi_{h_1}(n)\chi_{h_2}(n + h_1 - h_4)\overline{\chi_{h_3}(n)\chi_{h_4}(n + h_1 - h_4)}| \ge c\eta^4\delta^2.$$

Remark. In the original paper [7], attention is restricted to the linear case $\chi_h(n) =
e(\theta_h n)$, but the argument extends without difficulty to the general case, as we shall see
in the proof.

Proof. As in many arguments of analytic number theory and additive combinatorics in 
which a function that one does not wish to understand is to be eliminated, our main 
tool is the Cauchy-Schwarz inequality. Two applications of that inequality give that 

$$|\mathbb{E}_{n,m} a_n b_m \Phi(n,m)|^4 \le \mathbb{E}_{n,n',m,m'} \Phi(n,m)\overline{\Phi(n',m)}\,\overline{\Phi(n,m')}\Phi(n',m') \tag{6.2}$$

whenever $(a_n)_{n \in X}$, $(b_m)_{m \in Y}$, $(\Phi(n,m))_{n \in X, m \in Y}$ are 1-bounded sequences of complex
numbers.
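Inequality (6.2) can be sanity-checked numerically. The Python sketch below is an illustration only: the helper names and the sizes of the index sets are arbitrary choices, and floating-point arithmetic is used, so the comparison allows a small tolerance. The right-hand side is computed in the equivalent form $\mathbb{E}_{m,m'} |\mathbb{E}_n \Phi(n,m)\overline{\Phi(n,m')}|^2$.

```python
import cmath
import random

def random_disc_point(rng):
    """A complex number of modulus at most 1."""
    return cmath.rect(rng.random(), rng.uniform(0.0, 2.0 * cmath.pi))

def left_side(a, b, phi):
    """|E_{n,m} a_n b_m Phi(n,m)|^4."""
    total = sum(a[n] * b[m] * phi[n][m]
                for n in range(len(a)) for m in range(len(b)))
    return abs(total / (len(a) * len(b))) ** 4

def right_side(phi):
    """E_{m,m'} |E_n Phi(n,m) conj(Phi(n,m'))|^2, which equals the
    four-fold average on the right-hand side of (6.2)."""
    X, Y = len(phi), len(phi[0])
    total = 0.0
    for m in range(Y):
        for mp in range(Y):
            inner = sum(phi[n][m] * phi[n][mp].conjugate() for n in range(X)) / X
            total += abs(inner) ** 2
    return total / (Y * Y)

def check_once(seed, X=6, Y=5):
    """Draw random 1-bounded data and return both sides of (6.2)."""
    rng = random.Random(seed)
    a = [random_disc_point(rng) for _ in range(X)]
    b = [random_disc_point(rng) for _ in range(Y)]
    phi = [[random_disc_point(rng) for _ in range(Y)] for _ in range(X)]
    return left_side(a, b, phi), right_side(phi)

results = [check_once(seed) for seed in range(20)]
print(all(lhs <= rhs + 1e-12 for lhs, rhs in results))
```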

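Returning to the proposition itself, the assumptions imply that

$$\mathbb{E}_h |\mathbb{E}_n \Delta_h f(n)\chi_h(n)|^2 \ge \eta\delta^2,$$

where we have taken the expectation over some group $\mathbb{Z}/N'\mathbb{Z}$ with $N' \sim 2N$ (say) and
define all functions to be zero outside of $[N]$ and $\chi_h$ to be identically zero if $h \notin H$.
Expanding out and making some obvious substitutions this yields

$$\mathbb{E}_k \mathbb{E}_{n,m} f(m)\overline{f(m+k)}\,\overline{f(n)}f(n+k)\overline{\Delta_k \chi_{m-n}(n)} \ge \eta\delta^2.$$

Applying Hölder's inequality this means that

$$\mathbb{E}_k |\mathbb{E}_{n,m} f(m)\overline{f(m+k)}\,\overline{f(n)}f(n+k)\overline{\Delta_k \chi_{m-n}(n)}|^4 \ge \eta^4\delta^8.$$

Applying (6.2) for each $k$, we obtain

$$\mathbb{E}_k \mathbb{E}_{n,n',m,m'} \Delta_k\chi_{m-n}(n)\overline{\Delta_k\chi_{m'-n}(n)}\,\overline{\Delta_k\chi_{m-n'}(n')}\Delta_k\chi_{m'-n'}(n') \ge \eta^4\delta^8.$$

This is more suggestively written as

$$\mathbb{E}_{h_1+h_2=h_3+h_4} \mathbb{E}_{n,k} \Delta_k\chi_{h_1}(n)\Delta_k\chi_{h_2}(n + h_1 - h_4)\overline{\Delta_k\chi_{h_3}(n)\Delta_k\chi_{h_4}(n + h_1 - h_4)} \ge \eta^4\delta^8,$$

which is the same as

$$\mathbb{E}_{h_1+h_2=h_3+h_4} |\mathbb{E}_n \chi_{h_1}(n)\chi_{h_2}(n + h_1 - h_4)\overline{\chi_{h_3}(n)\chi_{h_4}(n + h_1 - h_4)}|^2 \ge \eta^4\delta^8.$$

This immediately implies the stated result by a trivial averaging argument. □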

We now give a corollary of this in the specific case that the $\chi_h(n)$ have the form
appearing in the statement of Theorem 5.4, that is to say

$$\chi_h(n) := e(\alpha_h n^2 + \beta_h n) \prod_{1 \le i < i' \le k} F_{[i',i]}^{m_{[i',i]}(h)}(g_h(n)\Gamma). \tag{6.3}$$
Corollary 6.2. Suppose that $f : [N] \to \mathbb{C}$ is a 1-bounded function and that

$$|\mathbb{E}_{n \in [N]} \Delta_h f(n)\chi_h(n)| \ge \delta$$

for all $h$ in some set $H$, $|H| \ge \delta N$. Suppose now that the functions $\chi_h(n)$ have the
specific form (6.3). Then for at least $\delta^C N^3$ additive quadruples $h_1 + h_2 = h_3 + h_4$ in $H^4$
there are frequencies $\alpha_{h_1,h_2,h_3,h_4}, \beta_{h_1,h_2,h_3,h_4} \in \mathbb{R}$ such that

$$|\mathbb{E}_{n \in [N]} \chi_{h_1}(n)\chi_{h_2}(n)\overline{\chi_{h_3}(n)\chi_{h_4}(n)}\, e(\alpha_{h_1,h_2,h_3,h_4} n^2 + \beta_{h_1,h_2,h_3,h_4} n)| \gg_\delta 1.$$

Proof. Apply Proposition 6.1 and then use the bracket identities in Lemma 5.5 to
expand out terms such as $\chi_{h_2}(n + h_1 - h_4)$. This exhibits

$$\chi_{h_1}(n)\chi_{h_2}(n + h_1 - h_4)\overline{\chi_{h_3}(n)\chi_{h_4}(n + h_1 - h_4)}$$

as a product of

$$\chi_{h_1}(n)\chi_{h_2}(n)\overline{\chi_{h_3}(n)\chi_{h_4}(n)}$$

times various (possibly $h$-dependent) terms of the form $e(\alpha n^2)$ or $e(\{\alpha n\}\{\beta n\})$. By
Lemma 3.5 (iv) the latter are almost 1-step nilsequences; the conclusion then follows
from Lemma 3.4. □

Remark. That this computation worked was no accident. In fact from the general
theory in [13] one knows that if $\chi(n)$ is a Lipschitz $s$-step nilsequence with a vertical
character then $\chi(n + h)\overline{\chi(n)}$ is an $(s-1)$-step nilsequence. We did not apply this general
theory here, since we are being forced to deal with the coordinate functions $F_{[i',i]}$ which
are not Lipschitz.

7. Step 1: Reducing the $h$-dependence

The aim of this rather long and technical section is to handle Step 1 of the outline
in §2. Our first task is to formulate properly exactly what it is we intend to do. Recall
that if $\|f\|_{U^4} \ge \delta$ then, from the fact that $\|\Delta_h f\|_{U^3} \gg \delta^{C}$ for $\gg \delta^{C} N$ values of $h$ and
Theorem 5.4, we have

$$|\mathbb{E}_{n \in [N]} \Delta_h f(n)\chi_h(n)| \gg_\delta 1 \tag{7.1}$$

where $\chi_h$ is an object having the form (6.3), that is to say

$$\chi_h(n) = e(\alpha_h n^2 + \beta_h n) \prod_{1 \le i < i' \le k} F_{[i',i]}^{m_{[i',i]}(h)}(g_h(n)\Gamma). \tag{7.2}$$

Each term involving an $F_{[i',i]}$ is, by the calculations in §5 and in particular those around
(5.2), essentially a bracket quadratic $e(\xi n \lfloor \xi' n \rfloor)$ involving two of the frequencies $\xi, \xi'$
in the "horizontal" part of the polynomial sequence $g_h(n)$.

Let us be a little more precise and write $g_h(n) = (\xi_{h,1} n, \dots, \xi_{h,k} n, \dots)$; thus the
numbers $\xi_{h,i}$ are the horizontal frequencies just alluded to. Write $\Xi_h := \{\xi_{h,1}, \dots, \xi_{h,k}\}$
for this set. When we outlined Step 1 earlier on, we did little more than suggest that
our aim was to show that no bracket quadratic $e(\xi_{h,i} n \lfloor \xi_{h,i'} n \rfloor)$ involving two genuinely
$h$-dependent frequencies $\xi_{h,i}, \xi_{h,i'}$ actually occurs in the formula for $\chi_h(n)$.

To attach meaning to this, we will split $\Xi_h$ as a union $\Xi_* \cup \Xi'_h$ of a "core" set
$\Xi_* = \{\xi_{h,1}, \dots, \xi_{h,k_*}\}$ and a "petal" set $\Xi'_h = \{\xi_{h,k_*+1}, \dots, \xi_{h,k}\}$ in such a way that the
frequencies $\xi_{h,i}$, $i = 1, \dots, k_*$, do not actually depend on $h$. Our task, then, is to show
that (7.1) and (7.2) may be achieved in such a way that $m_{[i',i]}(h) = 0$ when $i, i' > k_*$.
In other words, no bracket quadratic $e(\xi_{h,i} n \lfloor \xi_{h,i'} n \rfloor)$ with $i, i' > k_*$ actually occurs in the
expression for $\chi_h(n)$.

We will not prove that any situation such as (7.1) and (7.2) has this form automatically.
Rather, we will perform an inductive procedure in which the underlying frequency
sets are slowly modified so that they take on more and more characteristics of the above
"sunflower" decomposition into core and petals. At the same time, the set of $h$ for which
(7.1) holds will be gradually reduced, although it will always have cardinality $\gg_\delta N$.

Here is a precise statement. 

Proposition 7.1 (Step 1). Suppose that $\|f\|_{U^4} \ge \delta$. Then for $\gg_\delta N$ values of $h$ we
have $|\mathbb{E}_{n \in [N]} \Delta_h f(n)\chi_h(n)| \gg_\delta 1$, where

$$\chi_h(n) = e(\alpha_h n^2 + \beta_h n) \prod_{1 \le i < i' \le k} F_{[i',i]}^{m_{[i',i]}(h)}(g_h(n)\Gamma)$$

with $k, |m_{[i',i]}(h)| = O_\delta(1)$. Furthermore there is a "sunflower" decomposition of the
frequency sets $\Xi_h = \{\xi_{h,1}, \dots, \xi_{h,k}\}$ of $g_h(n)$ into a "core" $\Xi_* = \{\xi_{h,1}, \dots, \xi_{h,k_*}\}$ which
does not depend on $h$ together with "petals" $\Xi'_h = \{\xi_{h,k_*+1}, \dots, \xi_{h,k}\}$, in such a way that
$m_{[i',i]}(h) = 0$ if $i, i' > k_*$.

The last statement - that is to say the assertion that there are no bracket quadratics 
with two petal frequencies - is of course the beef here. 

Here is a plan of the rest of this section. Proposition 7.1 is proved by a kind of
induction (on the "complexity" of the core-petal decomposition). The inductive step
is stated as Proposition 7.5 below, and we give the full derivation of Proposition 7.1
shortly after the proof of that. Proposition 7.5 is itself deduced from Corollary 7.4,
which is in turn an easy deduction from Lemma 7.3. This latter result is the main
business of this section, and indeed is probably the hardest part of the entire argument.
For that reason we will, between stating it and proving it, give a kind of model variant
of the argument to illustrate the underlying algebraic structure.

Before we can begin we require a definition which will also feature later in the paper. 

Definition 7.2 (Approximate relations and dissociativity). Suppose that

$$\Xi = \{\xi_1, \dots, \xi_k\} \subseteq \mathbb{R}/\mathbb{Z}$$

is a finite set of frequencies. We say that this set satisfies an $M$-linear relation up to $\varepsilon$ if
there are integers $m_1, \dots, m_k$, $|m_i| \le M$, not all zero, such that $\|m_1\xi_1 + \dots + m_k\xi_k\|_{\mathbb{R}/\mathbb{Z}} \le
\varepsilon$. If a set $\Xi$ satisfies no such linear relation then we say that it is $(M, \varepsilon)$-dissociated.
We say that a further frequency $\xi$ lies in the $M$-linear span of $\Xi$ up to $\varepsilon$ if there are
integers $m_1, \dots, m_k$, $|m_i| \le M$, such that $\|\xi - m_1\xi_1 - \dots - m_k\xi_k\|_{\mathbb{R}/\mathbb{Z}} \le \varepsilon$.
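For small sets of frequencies this definition can be tested by exhaustive search. The following Python sketch is illustrative only: the names `circle_norm`, `find_relation` and `is_dissociated` are hypothetical, frequencies are taken to be exact rationals, and the search costs $(2M+1)^k$ steps, so it is meant for toy examples rather than for the quantitative arguments of this section.

```python
from fractions import Fraction
from itertools import product

def circle_norm(x):
    """Distance from x to the nearest integer, i.e. the norm on R/Z."""
    f = x % 1
    return min(f, 1 - f)

def find_relation(xis, M, eps):
    """Return integers (m_1, ..., m_k), not all zero and with |m_i| <= M, such
    that ||m_1 xi_1 + ... + m_k xi_k|| <= eps, or None if no such tuple exists."""
    for m in product(range(-M, M + 1), repeat=len(xis)):
        if any(m) and circle_norm(sum(mi * xi for mi, xi in zip(m, xis))) <= eps:
            return m
    return None

def is_dissociated(xis, M, eps):
    """True when the set satisfies no M-linear relation up to eps."""
    return find_relation(xis, M, eps) is None

xis = [Fraction(1, 7), Fraction(3, 7)]
# With |m_i| <= 1 every nontrivial combination is at distance >= 1/7 from Z.
print(is_dissociated(xis, 1, Fraction(1, 8)))
# Allowing |m_i| <= 2 produces an exact relation, for instance 1/7 + 2 * 3/7 = 1.
print(find_relation(xis, 2, Fraction(0)) is not None)
```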

Let us now state the main lemma of this section. We remark that the hypothesis of
this lemma comes from applying Proposition 6.1 to the assumption (7.1). However we
shall revisit this point later on when we actually perform the inductive application of
the lemma.

Lemma 7.3. Fix $h_1, h_2, h_3, h_4 \in H$ and suppose that for $j = 1, \dots, 4$ we have a
decomposition $\Xi_{h_j} = \Xi_* \cup \Xi'_{h_j}$ of the frequency set $\Xi_{h_j}$ into a core $\Xi_* = \{\xi_{h_j,1}, \dots, \xi_{h_j,k_*}\}$ not
depending on $j$ and a petal set $\Xi'_{h_j} = \{\xi_{h_j,k_*+1}, \dots, \xi_{h_j,k}\}$. Suppose that the functions
$\chi_h(n)$ have the form (6.3) above, where both $k$ and the indices $m_{[i',i]}(h)$ are bounded by
$M$, and suppose that we have

$$|\mathbb{E}_{n \in [N]} \chi_{h_1}(n)\chi_{h_2}(n)\overline{\chi_{h_3}(n)\chi_{h_4}(n)}\, e(\alpha_{h_1,h_2,h_3,h_4} n^2 + \beta_{h_1,h_2,h_3,h_4} n)| \ge 1/M. \tag{7.3}$$

Suppose that $m_{[i',i]}(h_1) \ne 0$ for some pair $i, i' > k_*$. Then either there is an $O_M(1)$-linear
relation, up to $O_M(1/N)$, between the elements in $\Xi_* \cup \Xi'_{h_1} \cup \Xi'_{h_2} \cup \Xi'_{h_3}$, or else there is
such a relation between the elements of $\Xi_* \cup \Xi'_{h_1} \cup \Xi'_{h_2} \cup \Xi'_{h_4}$.

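Proof. The main idea is to apply the distributional results on nilsequences, and in
particular the "Quantitative Ratner" result, Theorem 4.3, to the assumption (7.3).
There is a very natural way to do this, which is to write (7.3) as

$$|\mathbb{E}_{n \in [N]} \tilde F(\tilde g(n)\tilde\Gamma)| \ge 1/M, \tag{7.4}$$

where of course

$$\tilde F(\tilde g(n)\tilde\Gamma) := e(\alpha_{h_1,h_2,h_3,h_4} n^2 + \beta_{h_1,h_2,h_3,h_4} n) \prod_{1 \le i < i' \le k} \Big( \prod_{j=1,2} F_{[i',i]}^{m_{[i',i]}(h_j)}(g_{h_j}(n)\Gamma) \prod_{j=3,4} \overline{F_{[i',i]}^{m_{[i',i]}(h_j)}(g_{h_j}(n)\Gamma)} \Big).$$

We may interpret the left-hand side as one big polynomial nilsequence on the 2-step
nilmanifold $\tilde G/\tilde\Gamma$, where $\tilde G = G \times G \times G \times G \times \mathbb{R}$ and $\tilde\Gamma = \Gamma \times \Gamma \times \Gamma \times \Gamma \times \mathbb{Z}$, and the
polynomial sequence $\tilde g = \tilde g_{h_1,h_2,h_3,h_4}(n)$ is given by

$$\tilde g(n) = \tilde g_{h_1,h_2,h_3,h_4}(n) = g_{h_1}(n) \times g_{h_2}(n) \times g_{h_3}(n) \times g_{h_4}(n) \times (\alpha_{h_1,h_2,h_3,h_4} n^2 + \beta_{h_1,h_2,h_3,h_4} n).$$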

The Quantitative Ratner results are a little complicated, and so before continuing
with the proof we sketch how it goes in what might be termed the asymptotic limit case,
in which we work not with any given scale $N$, but rather with the limiting behaviour
as $N \to \infty$. More precisely, instead of (7.4) we assume merely that${}^4$

$$\lim_{N \to \infty} \mathbb{E}_{n \in [N]} \tilde F(\tilde g(n)\tilde\Gamma) \ne 0,$$

and instead of finding quantitative relations amongst the frequency sets we merely
conclude that the frequencies in either $\Xi_* \cup \Xi'_{h_1} \cup \Xi'_{h_2} \cup \Xi'_{h_3}$ or $\Xi_* \cup \Xi'_{h_1} \cup \Xi'_{h_2} \cup \Xi'_{h_4}$
are rationally dependent. The main difference between the model case and the actual
one is that the corresponding nilmanifold distribution results, due to Leibman [21], are
much cleaner in this setting. For simplicity of notation (in this sketch) let us suppose
that $\Xi_* = \emptyset$.

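Suppose, then, that $\Xi_{h_1} \cup \Xi_{h_2} \cup \Xi_{h_3}$ and $\Xi_{h_1} \cup \Xi_{h_2} \cup \Xi_{h_4}$ are both rationally independent.
Consider the orbit $(\tilde g(n)\tilde\Gamma)_{n \in \mathbb{N}}$ in $\tilde G/\tilde\Gamma$. Roughly speaking${}^5$, the results of [21] assert that
this orbit is equidistributed on a subnilmanifold of the form $H\tilde\Gamma/\tilde\Gamma$, where $H$ is a closed
connected rational subgroup of $\tilde G$.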

If $\tilde F$ were continuous then this would imply that

$$\lim_{N \to \infty} \mathbb{E}_{n \in [N]} \tilde F(\tilde g(n)\tilde\Gamma) = \int \tilde F(x)\, dm_H(x) \ne 0 \tag{7.5}$$

where $m_H$ is the Haar measure on $H\tilde\Gamma/\tilde\Gamma$. Unfortunately $\tilde F$ is not quite continuous,
a further technicality we will have to handle when discussing the proof of Lemma 7.3
proper. For the purposes of this sketch, however, let us assume that (7.5) holds.

4 The convergence of all limits involving polynomial nilsequences was established in [21], at least in the
case when $F$ is continuous.

5 In actual fact this is only true after subdividing $\mathbb{N}$ into finitely many subprogressions, and furthermore
we need to work with a translate $x_0 H\tilde\Gamma/\tilde\Gamma$. Both of these points are merely technical. The finitary
analogue of this result is, of course, the Quantitative Ratner Theorem, Theorem 4.3.
Let $\pi_1, \pi_2, \pi_3, \pi_4 : \tilde G \to G$ be the projections of $\tilde G$ onto each of the four factors of $G$
comprising $\tilde G$, and by abuse of notation use the same notation for the projection maps
from $\tilde G/\tilde\Gamma$ to the factors $G/\Gamma$. Now the projection $(\pi_1 \times \pi_2 \times \pi_3)(\tilde g(n))$ has, as its set of
horizontal frequencies, $\Xi_{h_1} \cup \Xi_{h_2} \cup \Xi_{h_3}$, a set which is rationally independent. But these
frequencies are precisely those occurring in the projection of $(\pi_1 \times \pi_2 \times \pi_3)(\tilde g(n)\tilde\Gamma)$ onto
the horizontal torus (abelianisation) of $G/\Gamma \times G/\Gamma \times G/\Gamma$, and hence the orbit of this
(abelian) nilsequence is dense. However Leibman's criterion${}^6$ asserts that a polynomial
nilsequence is dense if and only if its abelianisation is, and so $((\pi_1 \times \pi_2 \times \pi_3)(\tilde g(n)\tilde\Gamma))_{n \in \mathbb{N}}$
is dense in $G/\Gamma \times G/\Gamma \times G/\Gamma$.

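Since $(\tilde g(n)\tilde\Gamma)_{n \in \mathbb{N}}$ equidistributes in $H\tilde\Gamma/\tilde\Gamma$, we must have

$$(\pi_1 \times \pi_2 \times \pi_3)(H\tilde\Gamma/\tilde\Gamma) = G/\Gamma \times G/\Gamma \times G/\Gamma.$$

Topological arguments${}^7$ using the fact that $H$ is closed and connected let us lift this
statement to $\tilde G$ to conclude that

$$(\pi_1 \times \pi_2 \times \pi_3)(H) = G \times G \times G. \tag{7.6}$$

By exactly the same argument we have

$$(\pi_1 \times \pi_2 \times \pi_4)(H) = G \times G \times G. \tag{7.7}$$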

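We claim that as a consequence of these observations we have

$$[G, G] \times \mathrm{id} \times \mathrm{id} \times \mathrm{id} \times \mathrm{id} \subseteq H.$$

To see this, let $g, g' \in G$ be arbitrary. Then (7.6) implies that $H$ contains an element
of the form $(g, \mathrm{id}, \mathrm{id}, x, z)$, for some $x \in G$ and some $z \in \mathbb{R}$, whilst (7.7) implies that
$H$ contains an element of the form $(g', \mathrm{id}, x', \mathrm{id}, z')$ for some $x' \in G$ and some $z' \in \mathbb{R}$.
Since commutators in the product group $\tilde G$ are computed componentwise and $\mathbb{R}$ is abelian,
the commutator of these two elements is $([g, g'], \mathrm{id}, \mathrm{id}, \mathrm{id}, \mathrm{id})$, thereby establishing the
claim.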
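The commutator computation behind this claim can be illustrated concretely. The Python sketch below is not part of the original argument and makes several simplifying assumptions: it works with $k = 2$ generators, stores an element of $G$ as a triple $(t_1, t_2, t_{[2,1]})$ of Mal'cev coordinates, writes the real factor of the product group additively, and uses the hypothetical helper names `mul`, `inv`, `comm` and `comm_tilde`.

```python
from fractions import Fraction as Fr

# G is the free 2-step group on two generators, in Mal'cev coordinates
# (t_1, t_2, t_[2,1]); the group law adds coordinates and contributes the
# extra term t_2 * u_1 to the commutator coordinate.
ID = (Fr(0), Fr(0), Fr(0))

def mul(a, b):
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2] + a[1] * b[0])

def inv(a):
    return (-a[0], -a[1], -a[2] + a[1] * a[0])

def comm(a, b):
    """Commutator [a, b] = a^{-1} b^{-1} a b."""
    return mul(mul(inv(a), inv(b)), mul(a, b))

def comm_tilde(a, b):
    """Commutator in G x G x G x G x R, taken componentwise; the abelian
    factor R always contributes 0."""
    return tuple(comm(x, y) for x, y in zip(a[:4], b[:4])) + (Fr(0),)

g = (Fr(1, 2), Fr(3), Fr(-2, 5))
g_prime = (Fr(4, 3), Fr(1, 7), Fr(5))
x = (Fr(2), Fr(-1, 3), Fr(7, 9))
x_prime = (Fr(-5, 2), Fr(1, 4), Fr(0))

first = (g, ID, ID, x, Fr(9, 4))               # shape supplied by (7.6)
second = (g_prime, ID, x_prime, ID, Fr(-1, 3))  # shape supplied by (7.7)
print(comm_tilde(first, second) == (comm(g, g_prime), ID, ID, ID, Fr(0)))
```

The unknown components $x$ and $x'$ drop out because each is paired with the identity in the other element, which is exactly why the two projections (7.6) and (7.7) are needed.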

Remark. This idea has appeared in related contexts before, for example in the work
of Furstenberg and Weiss [6], as well as in less related contexts such as a paper of
Hrushovski [19, Lemma 4.11].

As a special case of the above claim, we see that for each pair $i, i'$ with $1 \le i < i' \le k$
and for each $t \in \mathbb{R}$ the element $z := (e_{[i',i]}^t, \mathrm{id}, \mathrm{id}, \mathrm{id}, \mathrm{id})$ lies in $H$. It follows that

$$\int \tilde F(zx)\, dm_H(x) = \int \tilde F(x)\, dm_H(x).$$

However a direct calculation using the definition of $\tilde F$ confirms that

$$\tilde F(zx) = e(t\, m_{[i',i]}(h_1)) \tilde F(x).$$

Since $t$ is arbitrary, the only way to reconcile this with (7.5) is to conclude that
$m_{[i',i]}(h_1) = 0$. Thus in this case (in which there is no core $\Xi_*$) we see that either
the functions $\chi_h(n)$ are somewhat trivial in the sense that all of the $m_{[i',i]}(h)$ vanish,
or else we were wrong to assume that the frequencies in both $\Xi_{h_1} \cup \Xi_{h_2} \cup \Xi_{h_3}$ and
$\Xi_{h_1} \cup \Xi_{h_2} \cup \Xi_{h_4}$ are rationally independent.

This concludes our sketch of the asymptotic limit case, and we now return to our
original task of proving Lemma 7.3. The underlying idea is the same as in the above
sketch except that everything must be made quantitative, without any recourse to
limits. Furthermore there was one point in the above sketch where we treated a special
case (the core set is empty) and others where we waved our hands somewhat (the
function $\tilde F$ is not Lipschitz, the orbit only equidistributes on a coset of a nilmanifold,
and then only after passing to a subprogression). These issues must, of course, be dealt
with properly.

6 The finitary analogue of this is the Quantitative Leibman dichotomy, Theorem 4.1.

7 In the finitary world these are somewhat painful and involve, for example, some quantitative linear
algebra; see Appendix A.

Consider the orbit $(\tilde g(n)\tilde\Gamma)_{n \in [N]}$. Let $\omega : \mathbb{R}^+ \to \mathbb{R}^+$ be a growth function to be
specified later. By Theorem 4.3 there is some $M_0 = O_{\omega,M}(1)$ (which we may clearly
assume to be at least $\max(M, \#\tilde G/\tilde\Gamma)$, since both of these quantities are $O_M(1)$) with
the following property. We may partition $[N]$ into subprogressions $P_j$ with lengths at
least $N/M_0$, such that corresponding to each progression $P_j$ the uniform measure

$$\mu_j := \frac{1}{|P_j|} \sum_{n \in P_j} \delta_{\tilde g(n)\tilde\Gamma}$$

is $1/\omega(M_0)$-close to the Haar measure $m_{H_j}$ on $b_j H_j \tilde\Gamma/\tilde\Gamma$, where $H_j \le \tilde G$ is some closed,
connected, $M_0$-rational subgroup. Namely for any Lipschitz function $F$ on $\tilde G/\tilde\Gamma$ we have

$$\Big| \mathbb{E}_{n \in P_j} F(\tilde g(n)\tilde\Gamma) - \int F\, dm_{H_j} \Big| \le \frac{1}{\omega(M_0)} \|F\|_{\mathrm{Lip}}. \tag{7.8}$$

By a trivial averaging argument, condition (7.4) implies that there is some $P = P_j$
such that

$$|\mathbb{E}_{n \in P} \tilde F(\tilde g(n)\tilde\Gamma)| \ge 1/M; \tag{7.9}$$

let $H = H_j$ be the corresponding group, and $m_H$ the Haar measure on $bH\tilde\Gamma/\tilde\Gamma$.

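Let $z \in [H, H]$ be an element, all of whose coordinates are bounded by $O_{M_0}(1)$, and
let $F : \tilde G/\tilde\Gamma \to \mathbb{C}$ be a Lipschitz function. Then $F_z(x\tilde\Gamma) = F(zx\tilde\Gamma)$ is also Lipschitz
and $\|F_z\|_{\mathrm{Lip}} = O_{M_0}(1)\|F\|_{\mathrm{Lip}}$. Furthermore since $m_H$ is invariant under translation by
$z$ (which lies in the centre of $\tilde G$) we have

$$\int F_z\, dm_H = \int F\, dm_H,$$

and thus from (7.8) we get

$$\Big| \mathbb{E}_{n \in P} F_z(\tilde g(n)\tilde\Gamma) - \int F\, dm_H \Big| = O_{M_0}(1/\omega(M_0)) \|F\|_{\mathrm{Lip}}.$$

And by the triangle inequality

$$|\mathbb{E}_{n \in P} F_z(\tilde g(n)\tilde\Gamma) - \mathbb{E}_{n \in P} F(\tilde g(n)\tilde\Gamma)| = O_{M_0}(1/\omega(M_0)) \|F\|_{\mathrm{Lip}}, \tag{7.10}$$

thus if $\omega$ is sufficiently rapidly-growing then the error term here is negligible and thus

$$\mathbb{E}_{n \in P} F(z\tilde g(n)\tilde\Gamma) \approx \mathbb{E}_{n \in P} F(\tilde g(n)\tilde\Gamma). \tag{7.11}$$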

Let $\delta_{M_0}$ be the quantity from the lifting Proposition A.4. Namely any element $x$
of $H[\tilde G, \tilde G]\tilde\Gamma/[\tilde G, \tilde G]\tilde\Gamma \cong \mathbb{R}^m/\mathbb{Z}^m$ whose coordinates are bounded by $\delta_{M_0}$ has a lift under
the natural projection $\tilde G \to \tilde G/[\tilde G, \tilde G]\tilde\Gamma$ to an element in $H$ with coordinates $O_{M_0}(1)$,
whose first $m$ coordinates are the reduced coordinates of $x$.

We now deal with the issue of $\tilde F$ not being Lipschitz. Fix $\delta_0 = \frac{1}{10}\delta_{M_0}$. We first need
to modify the function $\tilde F$. We will choose two parameters $\delta_1, \delta_2$, such that $\delta_1$ is much
smaller than $\delta_0$, and $\delta_2$ still smaller depending on $\delta_1$. However, both these quantities
will be $\gg_{M,\omega} 1$. Consider the distribution of some fixed coordinate $t_{h_j,[i',i]}$ of $\tilde g(n)\tilde\Gamma$ as
$n$ varies over $P$. We may clearly suppose that there is no $O_{M,\omega}(1)$-linear relation, up to
$O_{M,\omega}(1/N)$, amongst the frequencies $\Xi_* \cup \Xi'_{h_j}$, since otherwise the conclusion of the lemma
is trivially satisfied. If there is no such relation, and if the implicit constants in the
$O_{M,\omega}(1)$ notation above are chosen sufficiently large, then by the quantitative Leibman
dichotomy, Theorem 4.1, the sequence $(g_{h_j}(n)\Gamma)_{n \in P}$ is $\delta_2$-equidistributed in $G/\Gamma$. Fix
a $j \in \{1,2,3,4\}$ and a pair $i, i'$ with $1 \le i < i' \le k$. Let $\psi = \psi_{j,i,i'} : G/\Gamma \to [0,1]$ be
supported where $t_{h_j,[i',i]} \le 2\delta_1$ or $t_{h_j,[i',i]} \ge 1 - 2\delta_1$ and be equal to 1 whenever $t_{h_j,[i',i]} \le \delta_1$
or $t_{h_j,[i',i]} \ge 1 - \delta_1$ and have $\|\psi\|_{\mathrm{Lip}} = O_{M,\delta_1}(1)$. Let $\tilde\psi = \tilde\psi_{j,i,i'} : \tilde G/\tilde\Gamma \to [0,1]$ be the
pullback of $\psi_{j,i,i'}$ under the natural projection from $\tilde G$ to the $j$th copy of $G$.
Our preceding observation about the distribution of $(g_{h_j}(n)\Gamma)_{n \in P}$ implies that

$$|\mathbb{E}_{n \in P} \tilde\psi(\tilde g(n)\tilde\Gamma)| = |\mathbb{E}_{n \in P} \psi(g_{h_j}(n)\Gamma)| \le \int_{G/\Gamma} \psi\, dm_{G/\Gamma} + \delta_2 \|\psi\|_{\mathrm{Lip}} = o_{M;\delta_1 \to 0}(1) + O_{M,\delta_1}(\delta_2),$$

where $o_{M;\delta_1 \to 0}(1)$ denotes a quantity that is bounded in magnitude by $c_M(\delta_1)$ for some
$c_M(\delta_1)$ that goes to zero as $\delta_1 \to 0$ for any fixed $M$.

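Let $z$ be an element in $[H,H]$ with $O_{M_0}(1)$-bounded coordinates, and suppose $z =
(z_{h_1}, z_{h_2}, z_{h_3}, z_{h_4}, w)$ under the decomposition of $\tilde G$ as $G \times G \times G \times G \times \mathbb{R}$. Then
$\psi_{z_{h_j}}(g_{h_j}(n)\Gamma) = \psi(z_{h_j} g_{h_j}(n)\Gamma)$ is Lipschitz with $\|\psi_{z_{h_j}}\|_{\mathrm{Lip}} = O_{M_0,\delta_1}(1)$, and by the
invariance of $m_{G/\Gamma}$ under multiplication by $z_{h_j}$ we get

$$|\mathbb{E}_{n \in P} \tilde\psi(z\tilde g(n)\tilde\Gamma)| = o_{M_0;\delta_1 \to 0}(1) + O_{M_0,\delta_1}(\delta_2).$$

By adding, the same type of bounds hold for the function $\Psi := \sum_{j=1}^{4} \sum_{1 \le i < i' \le k} \tilde\psi_{j,i,i'}$,
that is to say

$$|\mathbb{E}_{n \in P} \Psi(\tilde g(n)\tilde\Gamma)|,\ |\mathbb{E}_{n \in P} \Psi(z\tilde g(n)\tilde\Gamma)| = o_{M_0;\delta_1 \to 0}(1) + O_{M_0,\delta_1}(\delta_2).$$

Let us, at this point, fix $\delta_1 \gg_{M_0} 1$ in such a way that the $o_{M_0;\delta_1 \to 0}(1)$ term here is
bounded by $\delta_0^{10}$ (say), and let us then choose $\delta_2 \gg_{M_0} 1$ in such a way that the $O_{M_0,\delta_1}(\delta_2)$
term is also bounded by $\delta_0^{10}$. Then the last displayed equation becomes

$$|\mathbb{E}_{n \in P} \Psi(\tilde g(n)\tilde\Gamma)|,\ |\mathbb{E}_{n \in P} \Psi(z\tilde g(n)\tilde\Gamma)| = O(\delta_0^{10}). \tag{7.12}$$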

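Note that by construction $\Psi$ is equal to 1 in a $\delta_1$-neighbourhood of all of the
discontinuities of our function $\tilde F$. As a result of this it is clear that we may find a function $\tilde F_0$
with the property that

$$\|\tilde F_0\|_{\mathrm{Lip}} = O_{M_0}(1)$$

whilst

$$|\tilde F - \tilde F_0| \le \Psi$$

pointwise. By (7.10) we have

$$|\mathbb{E}_{n \in P} \tilde F_0(z\tilde g(n)\tilde\Gamma) - \mathbb{E}_{n \in P} \tilde F_0(\tilde g(n)\tilde\Gamma)| = O_{M_0}(1/\omega(M_0))$$

and by (7.12) we have

$$|\mathbb{E}_{n \in P} (\tilde F - \tilde F_0)(z\tilde g(n)\tilde\Gamma)|,\ |\mathbb{E}_{n \in P} (\tilde F - \tilde F_0)(\tilde g(n)\tilde\Gamma)| = O(\delta_0^{10}).$$

Adding, we obtain

$$|\mathbb{E}_{n \in P} \tilde F(z\tilde g(n)\tilde\Gamma) - \mathbb{E}_{n \in P} \tilde F(\tilde g(n)\tilde\Gamma)| = O_{M_0}(1/\omega(M_0)) + O(\delta_0^{10}).$$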
Recall that $\delta_0 = \frac{1}{10}\delta_{M_0}$ depends only on $M_0$. By choosing $\omega : \mathbb{R}^+ \to \mathbb{R}^+$ to be sufficiently
rapidly-growing, the whole of the right-hand side can therefore be made $O(\delta_0^{10})$, that is
to say

$$\mathbb{E}_{n \in P} \tilde F(z\tilde g(n)\tilde\Gamma) = \mathbb{E}_{n \in P} \tilde F(\tilde g(n)\tilde\Gamma) + O(\delta_0^{10}). \tag{7.13}$$

Now that $\omega$ has been fixed, we have $M_0 = O_M(1)$ and $\delta_0 \gg_M 1$. As stated before,
our aim now is to assume that $\Xi_* \cup \Xi'_{h_1} \cup \Xi'_{h_2} \cup \Xi'_{h_3}$ and $\Xi_* \cup \Xi'_{h_1} \cup \Xi'_{h_2} \cup \Xi'_{h_4}$ are highly
dissociated and use this to produce an element $z \in [H, H]$ which, in conjunction with
(7.4), contradicts (7.13). We shall require a further parameter $\delta_3 \gg_M 1$, much smaller
than $\delta_0$. We will specify it later on.

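Let $\pi_1, \pi_2, \pi_3, \pi_4 : \tilde G \to G$ be the projections from $\tilde G$ onto the four copies of $G$ (recall,
of course, that $\tilde G = G \times G \times G \times G \times \mathbb{R}$). Once again we abuse notation and use
the same notation for the corresponding projections from $\tilde G/\tilde\Gamma$ to $G/\Gamma$. Suppose that
$\Xi_* \cup \Xi'_{h_1} \cup \Xi'_{h_2} \cup \Xi'_{h_3}$ is $O_M(1)$-dissociated up to $O_M(1/N)$. Let us examine the abelian
part of $((\pi_1 \times \pi_2 \times \pi_3)(\tilde g(n)\tilde\Gamma))_{n \in P}$, that is to say the image of $(\tilde g(n)\tilde\Gamma)_{n \in P}$ under the
projection

$$\pi^{ab}_{123} : \tilde G/\tilde\Gamma \to G/[G,G]\Gamma \times G/[G,G]\Gamma \times G/[G,G]\Gamma \cong (\mathbb{R}/\mathbb{Z})^k \times (\mathbb{R}/\mathbb{Z})^k \times (\mathbb{R}/\mathbb{Z})^k.$$

This image takes the form

$$\big( (\xi_{h_1,i} n)_{i=1}^{k_*}, (\xi_{h_1,i} n)_{i=k_*+1}^{k}, (\xi_{h_2,i} n)_{i=1}^{k_*}, (\xi_{h_2,i} n)_{i=k_*+1}^{k}, (\xi_{h_3,i} n)_{i=1}^{k_*}, (\xi_{h_3,i} n)_{i=k_*+1}^{k} \big) \pmod 1.$$

Recalling that $\Xi_* = \{\xi_{h,1}, \dots, \xi_{h,k_*}\}$ and that $\Xi'_h = \{\xi_{h,k_*+1}, \dots, \xi_{h,k}\}$, it follows from
the asserted dissociativity (assuming the implicit $O_M(1)$ terms are large enough) and
Kronecker's theorem in quantitative form (cf. Lemma D.2), that this image is
$\delta_3$-equidistributed in the subtorus

$$\{(t, u_1, t, u_2, t, u_3) : t \in (\mathbb{R}/\mathbb{Z})^{k_*},\ u_1, u_2, u_3 \in (\mathbb{R}/\mathbb{Z})^{k-k_*}\} \subseteq (\mathbb{R}/\mathbb{Z})^k \times (\mathbb{R}/\mathbb{Z})^k \times (\mathbb{R}/\mathbb{Z})^k.$$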

In particular there is an element in $(\pi^{ab}_{123}(\tilde g(n)\tilde\Gamma))_{n \in P}$ within $O(\delta_3)$ of

$$(0, (0, \dots, \delta_0, \dots, 0), 0, 0, 0, 0) + \pi^{ab}_{123}(b),$$

where the $\delta_0$ lies in the $i'$th position (note that $i' > k_*$ by assumption). Now since the
uniform probability measure on $(\tilde g(n)\tilde\Gamma)_{n \in P}$ is $O_{M_0}(1/\omega(M_0)) \le \delta_0^{10}$-close to the Haar
measure on $bH\tilde\Gamma/\tilde\Gamma$, the projection $(\pi^{ab}_{123}(\tilde g(n)\tilde\Gamma))_{n \in P}$ is $\delta_0^{10}$-equidistributed in $\pi^{ab}_{123}(bH\tilde\Gamma/\tilde\Gamma)$.
This means that there is an element $x$ of $\pi^{ab}_{123}(H\tilde\Gamma/\tilde\Gamma)$ within $O(\delta_3)$ of

$$(0, (0, \dots, \delta_0, \dots, 0), 0, 0, 0, 0).$$

Recall that we chose $\delta_0$ so that the lifting property A.4 holds. Recalling the
relationship between distance in coordinates and distance in $\tilde G$ (cf. [13, Lemma A.4]) we can
thus find an element in $(\pi_1 \times \pi_2 \times \pi_3)(H)$ at distance $O_M(\delta_3)$ from $e_{i'}^{\delta_0} z_1 \times z_2 \times z_3$ where
$z_1, z_2, z_3 \in [G, G]$ are arbitrary (with coordinates bounded by $O_M(1)$). It follows that
we can find a $g \in H$ with

$$d_{\tilde G}(g, e_{i'}^{\delta_0} z_1 \times z_2 \times z_3 \times w_4 \times u) = O_M(\delta_3)$$

where $w_4 \in G$ and $u \in \mathbb{R}$ are arbitrary.

Similarly, if $\Xi_* \cup \Xi'_{h_1} \cup \Xi'_{h_2} \cup \Xi'_{h_4}$ is $O_M(1)$-dissociated up to $O_M(1/N)$ then we may
locate inside $H$ an element $g'$ with

$$d_{\tilde G}(g', e_i^{\delta_0} z'_1 \times z'_2 \times w'_3 \times z'_4 \times u') = O_M(\delta_3),$$

where $z'_1, z'_2, z'_4 \in [G, G]$, and $w'_3 \in G$, $u' \in \mathbb{R}$ are arbitrary.
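We then take for our element $z \in [H, H]$ the commutator $[g, g']$. Noting that${}^8$

$$[e_{i'}^{\delta_0} z_1 \times z_2 \times z_3 \times w_4 \times u,\ e_i^{\delta_0} z'_1 \times z'_2 \times w'_3 \times z'_4 \times u'] = [e_{i'}^{\delta_0}, e_i^{\delta_0}] \times \mathrm{id} \times \mathrm{id} \times \mathrm{id} \times \mathrm{id} = [e_{i'}, e_i]^{\delta_0^2} \times \mathrm{id} \times \mathrm{id} \times \mathrm{id} \times \mathrm{id}$$

and that the maps $g \mapsto [g, g_0]$ are uniformly Lipschitz for $g$ in any bounded set, we
have

$$d_{\tilde G}(z, [e_{i'}, e_i]^{\delta_0^2} \times \mathrm{id} \times \mathrm{id} \times \mathrm{id} \times \mathrm{id}) = O_{M_0}(\delta_3). \tag{7.14}$$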

Now, as we have remarked, the coordinate functions Fwn : G/Y — > C are not Lipschitz. 
However, they are OA/(l)-Lipschitz when restricted to [G,G]Y/Y, as an easy computa- 
tion confirms. It follows from this observation, (I7.14p and the definition of the functions 
that for j = 1, 2, 3, 4 and for any x G G/Y we have 

F M (*x) m r".i<^ = F [VA {x) m VA^ + Mo (5 3 ) 

unless I = i, I' = i' and j = 1 in which case 

F [i , ) f,(^) TO [*'.fl( hl ) = eiSlm^^h^F^ix)^'^ + Mo (5 3 ). 

Taking products over all choices of i, i', it follows that 

F(zx) = e{8lm[ VA {h 1 ))F{x) + Mo (5 3 ), 

from which it of course follows that 

E neP F(zg(n)Y) = e(<5 2 m [i , ii] (/ il ))E neP F(g(n)r) + Mo (S 3 ). 

Choosing $\delta_3$ so small that the error term here is $O(\delta_0^{10})$, we obtain upon comparison
with (7.13) that

\[ |1 - e(\delta_0^2 m_{[i',i]}(h_1))| \, |\mathbb{E}_{n \in P} F(g(n)\Gamma)| = O(\delta_0^{10}). \]

Recalling that $m_{[i',i]}(h_1)$ is an integer bounded in magnitude by $M$, that

\[ |\mathbb{E}_{n \in P} F(g(n)\Gamma)| > 1/M, \]

and that $\delta_0$ may certainly be assumed to be much smaller than $1/M$, we are forced to
conclude (at last!) that $m_{[i',i]}(h_1) = 0$. □
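
The size comparison behind this last step can be checked numerically. The following is only an illustrative sketch: the sample values of `M` and `delta0` and the helper names are assumptions made for the example, not quantities fixed in the text. It computes the smallest value of $|1 - e(\delta_0^2 m)|/M$ over nonzero integers $|m| \leq M$ and compares it with $\delta_0^{10}$.

```python
import cmath


def e(x: float) -> complex:
    """The additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def smallest_gap(M: int, delta0: float) -> float:
    """Minimum of |1 - e(delta0^2 * m)| / M over nonzero integers |m| <= M."""
    return min(
        abs(1 - e(delta0 ** 2 * m)) / M
        for m in range(-M, M + 1)
        if m != 0
    )


# Hypothetical sample values with delta0 much smaller than 1/M.
M = 50
delta0 = 1e-3
gap = smallest_gap(M, delta0)
print(gap, delta0 ** 10)
```

For these sample values the lower bound for a nonzero $m$ exceeds $\delta_0^{10}$ by many orders of magnitude, which is the mechanism forcing $m_{[i',i]}(h_1) = 0$ once the implied constants are accounted for.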

We may put Lemma 7.3 together with Corollary 6.2 in a straightforward manner.

Corollary 7.4. Suppose that $|\mathbb{E}_{n \in [N]} \Delta_h f(n) \chi_h(n)| \geq 1/M$ for all $h \in H$, where $H \subseteq [N]$,
$|H| \geq N/M$ and $\chi_h(n)$ has the form (6.3) with complexity at most $M$, and with
decompositions of the frequency sets $\Xi_h = \{\xi_{h,1}, \dots, \xi_{h,k}\}$ into cores $\Xi_* = \{\xi_{*,1}, \dots, \xi_{*,k_*}\}$
which do not depend on $h$ and petal sets $\Xi'_h = \{\xi_{h,k_*+1}, \dots, \xi_{h,k}\}$. Then one of the
following two alternatives holds true:

(i) There is a set $H' \subseteq H$, $|H'| \gg_M |H|$, such that $m_{[i',i]}(h) = 0$ whenever $i, i' > k_*$
and $h \in H'$;

(ii) For $\gg_M N^3$ triples $(h, h', h'') \in H^3$ the set $\Xi_* \cup \Xi'_h \cup \Xi'_{h'} \cup \Xi'_{h''}$ fails to be $O_M(1)$-
dissociated up to $O_M(1/N)$.

8 Here we have used the fact, specific to the 2-step case, that $[x^t, y^{t'}] = [x, y]^{tt'}$. One way to check
this would be to verify it for $t, t' \in \mathbb{Z}$ and use the fact that both sides are polynomials in a suitable
coordinate system. In the higher step case, the more general Baker-Campbell-Hausdorff formula could
be used instead.
Proof. By Corollary 6.2 there are $\gg_M N^3$ additive quadruples $h_1 + h_2 = h_3 + h_4$ such
that there are $\alpha_{h_1,h_2,h_3,h_4}, \beta_{h_1,h_2,h_3,h_4}$ for which

\[ |\mathbb{E}_{n \in [N]} \chi_{h_1}(n) \chi_{h_2}(n) \overline{\chi_{h_3}(n) \chi_{h_4}(n)} e(\alpha_{h_1,h_2,h_3,h_4} n^2 + \beta_{h_1,h_2,h_3,h_4} n)| \gg_M 1. \]

By pigeonhole there must either be $\gg_M N^3$ of these quadruples such that $m_{[i',i]}(h_1) = 0$
for all $i, i' > k_*$, in which case we are clearly in alternative (i), or else there must be
some choice of $i, i' > k_*$ such that there are $\gg_M N^3$ quadruples with $m_{[i',i]}(h_1) \neq 0$.
By Lemma 7.3 it follows that for each of these quadruples at least one of the sets
$\Xi_* \cup \Xi'_{h_1} \cup \Xi'_{h_2} \cup \Xi'_{h_3}$ or $\Xi_* \cup \Xi'_{h_1} \cup \Xi'_{h_2} \cup \Xi'_{h_4}$ fails to be $O_M(1)$-dissociated up to $O_M(1/N)$.
It follows immediately that we are in case (ii). □

Now if alternative (i) holds in this last corollary then Step 1 is complete (that is,
Proposition 7.1 is proven). If alternative (ii) holds, then it is possible to replace the
core-petal decomposition $\Xi_h = \Xi_* \cup \Xi'_h$ by one in which some of the petal behaviour
is absorbed into the core. The precise statement of this, which follows now, is slightly
long:

Proposition 7.5. Let $H \subseteq [N]$ be a set with $|H| \geq N/M$. Suppose that

\[ |\mathbb{E}_{n \in [N]} \Delta_h f(n) \chi_h(n)| \geq 1/M \]

for all $h \in H$, where the nilcharacter $\chi_h(n)$ has the form (6.3) with
complexity at most $M$ and there is a decomposition of the underlying frequency set
$\Xi_h = \{\xi_{h,1}, \dots, \xi_{h,k}\}$ into

• a core component $\Xi_* = \{\xi_{*,1}, \dots, \xi_{*,k_*}\}$ which does not depend on $h$ and

• a petal component $\Xi'_h = \{\xi_{h,k_*+1}, \dots, \xi_{h,k}\}$.
Then either

• there is a set $H' \subseteq H$, $|H'| \gg_M |H|$, such that $m_{[i',i]}(h) = 0$ for all $i, i' > k_*$
and for all $h \in H'$, or

• there is a set $\tilde H \subseteq H$, $|\tilde H| \gg_M |H|$, and nilcharacters $\tilde\chi_h(n)$ of complexity
$O_M(1)$, $h \in \tilde H$, such that

\[ |\mathbb{E}_{n \in [N]} \Delta_h f(n) \tilde\chi_h(n)| \gg_M 1 \qquad (7.15) \]

for all $h \in \tilde H$. Here the nilcharacters $\tilde\chi_h(n)$ have the form

\[ \tilde\chi_h(n) = e(\tilde\alpha_h n^2 + \tilde\beta_h n) \prod_{1 \leq i < i' \leq \tilde k} F_{[i',i]}^{\tilde m_{[i',i]}(h)}(\tilde g_h(n)\Gamma), \]

where $\tilde g_h(n) = (\tilde\xi_{h,1} n, \dots, \tilde\xi_{h,\tilde k} n, 0, \dots, 0)$. Furthermore writing

\[ \tilde\Xi_h := \{\tilde\xi_{h,1}, \dots, \tilde\xi_{h,\tilde k}\} \]

we have a decomposition $\tilde\Xi_h = \tilde\Xi_* \cup \tilde\Xi'_h$, where either

(i) (core decreases) $|\tilde\Xi_*| < |\Xi_*|$ and $|\tilde\Xi'_h| = |\Xi'_h|$ or

(ii) (petals decrease) $|\tilde\Xi_*| \leq |\Xi_*| + 1$ and $|\tilde\Xi'_h| < |\Xi'_h|$.

Proof of Proposition 7.1 a.k.a. Step 1. Before embarking on the proof of this last
proposition, we remark how a simple iteration of it leads to Proposition 7.1. One starts
with the trivial decomposition $\Xi_h = \Xi_* \cup \Xi'_h$ where $\Xi_* = \emptyset$ and $\Xi'_h = \Xi_h$, and with the
initial value of $M$ being $O_\delta(1)$. It is not hard to see that there cannot be more than
$O_M(1)$ iterations of alternatives (i) (core decreases) or (ii) (petals decrease). □
Proof of Proposition 7.5. By Corollary 7.4 we may assume that there are $\gg_M N^3$
triples $(h, h', h'') \in H^3$ such that $\Xi_* \cup \Xi'_h \cup \Xi'_{h'} \cup \Xi'_{h''}$ fails to be $O_M(1)$-dissociated up to
$O_M(1/N)$. To each such triple is associated a $k_* + 3(k - k_*)$ tuple

\[ q_{*,1}, \dots, q_{*,k_*}, q_{h,k_*+1}, \dots, q_{h,k}, q_{h',k_*+1}, \dots, q_{h',k}, q_{h'',k_*+1}, \dots, q_{h'',k} \]

of integers, all at most $O_M(1)$ in magnitude, such that

\[ \| q_{*,1}\xi_{*,1} + \dots + q_{*,k_*}\xi_{*,k_*} + q_{h,k_*+1}\xi_{h,k_*+1} + \dots + q_{h,k}\xi_{h,k} + q_{h',k_*+1}\xi_{h',k_*+1} + \dots + q_{h',k}\xi_{h',k} + q_{h'',k_*+1}\xi_{h'',k_*+1} + \dots + q_{h'',k}\xi_{h'',k} \|_{\mathbb{R}/\mathbb{Z}} = O_M(1/N). \]

By pigeonholing we may pass to a further subcollection of triples $h, h', h''$ for which
these integers $q_{h,j}, q_{h',j'}, q_{h'',j''}$ have no $h, h', h''$-dependence. If at least one of these
latter quantities (with $j > k_*$) is nonzero then by relabeling we may assume it is $q_{h,k}$.
All this having been done, let us fix $h'$ and $h''$ appearing in $\gg_M N$ of these triples. We
then have integers $q_1, \dots, q_k = O_M(1)$, not all zero, and some frequency $\xi_0$ such that

\[ \| \xi_0 + q_1\xi_{*,1} + \dots + q_{k_*}\xi_{*,k_*} + q_{k_*+1}\xi_{h,k_*+1} + \dots + q_k\xi_{h,k} \|_{\mathbb{R}/\mathbb{Z}} = O_M(1/N) \]

for at least $\gg_M N$ values of $h$. Furthermore (case 1) we have $\xi_0 = 0$ if $q_{k_*+1} = \dots = q_k = 0$;
otherwise (case 2) we have $q_k \neq 0$.

Suppose we are in case 1 and that, without loss of generality, we have $q_{k_*} \neq 0$. Then
$\xi_{*,k_*}$ is in the $O_M(1)$-linear span, up to $O_M(1/N)$, of the set $\tilde\Xi_* := \{\frac{1}{Q}\xi_{*,1}, \dots, \frac{1}{Q}\xi_{*,k_*-1}\}$,
where $Q$ is the lowest common multiple of the integers up to $O_M(1)$. Taking $\tilde\Xi'_h = \Xi'_h$,
we see that (i) is satisfied and also that $\Xi_h$ is in the $O_M(1)$-linear span, up to $O_M(1/N)$,
of $\tilde\Xi_* \cup \tilde\Xi'_h$. Suppose now that we are in case 2; then take $\tilde\Xi_* = \frac{1}{Q}\Xi_* \cup \{\frac{1}{Q}\xi_0\}$ and
$\tilde\Xi'_h = \frac{1}{Q}\Xi'_h \setminus \{\frac{1}{Q}\xi_{h,k}\}$. Now condition (ii) is satisfied, and once again $\Xi_h$ is in the $O_M(1)$-
linear span, up to $O_M(1/N)$, of $\tilde\Xi_* \cup \tilde\Xi'_h$.
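
The role of the rescaling by $1/Q$ can be seen in a simplified exact setting. The sketch below is an illustration under stated assumptions: it works with an exact relation over the rationals (ignoring both the reduction modulo 1 and the $O_M(1/N)$ errors of the text), and the function name `rescale_and_solve` is hypothetical. Given integers $q_j$ with $\sum_j q_j \xi_j = 0$ and $q_k \neq 0$, it expresses $\xi_k$ as an integer combination of the rescaled frequencies $\xi_j/Q$.

```python
from fractions import Fraction
from math import lcm


def rescale_and_solve(q, xi):
    """Assume sum(q[j] * xi[j]) == 0 exactly and q[-1] != 0.

    Returns (Q, rescaled, coeffs) where Q = lcm(1, ..., max|q_j|),
    rescaled[j] = xi[j] / Q, and coeffs are integers such that
    xi[-1] == sum(coeffs[j] * rescaled[j]).
    """
    bound = max(abs(c) for c in q)
    Q = lcm(*range(1, bound + 1))
    rescaled = [x / Q for x in xi[:-1]]
    # |q[-1]| divides Q, so Q // q[-1] is an exact integer quotient.
    coeffs = [-c * (Q // q[-1]) for c in q[:-1]]
    return Q, rescaled, coeffs


# Hypothetical example: 2*xi_1 - 3*xi_2 + 4*xi_3 = 0.
q = [2, -3, 4]
xi = [Fraction(1, 7), Fraction(2, 5), Fraction(8, 35)]
Q, rescaled, coeffs = rescale_and_solve(q, xi)
print(Q, coeffs, sum(c * r for c, r in zip(coeffs, rescaled)))
```

Dividing by $Q$ is what keeps the coefficients integral: without it one would only obtain rational coefficients $-q_j/q_k$.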

The treatment of the two cases is, henceforth, the same and at this point we revert
to the bracket quadratic expressions

\[ F_{[i',i]}^{m_{[i',i]}(h)}(g_h(n)\Gamma) = e(m_{[i',i]}(h)\, \xi_{h,i'} n \lfloor \xi_{h,i} n \rfloor). \]

For each $\xi_{h,i}$ we substitute in the expression for this frequency as an $O_M(1)$-linear
combination of the frequencies in $\tilde\Xi_* \cup \tilde\Xi'_h$, plus an error which is $O_M(1/N)$. To simplify
this we use the bracket identities of Lemma [531 repeatedly to express the whole product 
$\chi_h(n)$ as a product of terms $e(\tilde m_{[i',i]}(h)\, \tilde\xi_{h,i'} n \lfloor \tilde\xi_{h,i} n \rfloor)$ with $i < i'$, where the exponents
$\tilde m_{[i',i]}(h)$ are still $O_M(1)$, together with various terms of the form $e(\theta n^2)$, $e(\{\alpha n\}\{\beta n\})$
and $e(\alpha n \lfloor \beta n \rfloor)$ with $\beta = O_M(1/N)$.

Now we may use Lemma 3.5 (ii), (iii) and (iv) repeatedly, bearing in mind the
assumption $|\mathbb{E}_{n \in [N]} \Delta_h f(n) \chi_h(n)| \geq 1/M$, to remove all terms of these last two types
and replace them by a single linear term $e(\theta' n)$. Doing this and then taking the new
bracket quadratics $e(\tilde m_{[i',i]}(h)\, \tilde\xi_{h,i'} n \lfloor \tilde\xi_{h,i} n \rfloor)$ and writing them as nilcoordinate functions
$F_{[i',i]}^{\tilde m_{[i',i]}(h)}(\tilde g_h(n)\Gamma)$, we obtain precisely the desired conclusion (7.15). □

8. Step 2: Approximate linearity 

In this section we address Step 2 of the outline in §5. In the last section we
decomposed the underlying frequency sets $\Xi_h = \{\xi_{h,1}, \dots, \xi_{h,k}\}$ into a core set $\Xi_*$ and a
petal set $\Xi'_h$, in such a way that no nilcharacter $F_{[i',i]}(g_h(n)\Gamma)$ corresponding to two petal
frequencies $\xi_{h,i}, \xi_{h,i'}$ appears in the expression for $\chi_h(n)$. Our task now is to proceed
from here to show that, at least for many $h$, the petal set $\Xi'_h$ has a weak linear structure.
There follows a precise statement of what we shall prove. By a bracket-linear form of
complexity $M$ we mean a function $\psi : \mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ of the form

\[ \psi(h) = \beta + \alpha_1\{\beta_1 h\} + \dots + \alpha_m\{\beta_m h\} + \theta h, \]
where the $\alpha_j, \beta_j, \beta, \theta$ lie in $\mathbb{R}$ and $m \leq M$.
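
To make the definition concrete, here is a minimal sketch of a bracket-linear form evaluated in exact rational arithmetic. The function name `bracket_linear` and the sample coefficients are assumptions chosen for illustration; the example also shows that such a form need not be additive in $h$.

```python
from fractions import Fraction
from math import floor


def frac(x: Fraction) -> Fraction:
    """Fractional part {x} = x - floor(x), taking values in [0, 1)."""
    return x - floor(x)


def bracket_linear(h, beta0, alphas, betas, theta):
    """psi(h) = beta0 + sum_j alphas[j]*{betas[j]*h} + theta*h, reduced mod 1."""
    value = beta0 + sum(a * frac(b * h) for a, b in zip(alphas, betas)) + theta * h
    return frac(value)


# Hypothetical sample form of complexity m = 1.
ARGS = dict(beta0=Fraction(0), alphas=[Fraction(1, 3)],
            betas=[Fraction(3, 5)], theta=Fraction(0))

psi1 = bracket_linear(1, **ARGS)   # (1/3) * {3/5}
psi2 = bracket_linear(2, **ARGS)   # (1/3) * {6/5}
print(psi1, psi2)
```

Here $\psi(1) + \psi(1)$ and $\psi(2) + \psi(0)$ differ modulo 1, even though $1 + 1 = 2 + 0$, which is the failure of genuine linearity that the later pigeonholing argument is designed to avoid.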

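Proposition 8.1. Suppose that $f : [N] \to \mathbb{C}$ is a 1-bounded function with $\|f\|_{U^4} \geq \delta$.
Then there is a set $H \subseteq [N]$, $|H| \gg_\delta N$, such that for all $h \in H$ we have

\[ |\mathbb{E}_{n \in [N]} \Delta_h f(n) \chi_h(n)| \gg_\delta 1. \]

Here we have

\[ \chi_h(n) = e(\alpha_h n^2 + \beta_h n) \prod_{1 \leq i < i' \leq k} F_{[i',i]}^{m_{[i',i]}(h)}(g_h(n)\Gamma) \qquad (8.1) \]

with $k, |m_{[i',i]}(h)| = O_\delta(1)$, where $g_h(n) = (\xi_{h,1} n, \dots, \xi_{h,k} n, 0, \dots, 0)$, $m_{[i',i]}(h) = 0$ if
$i, i' > k_*$, the frequency set $\Xi_h$ decomposes as $\Xi_* \cup \Xi'_h$ with $\Xi_* = \{\xi_{*,1}, \dots, \xi_{*,k_*}\}$
independent of $h$, and every frequency $\xi_{h,i}$, $i > k_*$, in the petal set $\Xi'_h$ is a bracket linear
form in $h$ of complexity $O_\delta(1)$.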

We shall establish this proposition inductively in a manner not too dissimilar to that
in the last section. The inductive step which drives Proposition 8.1 is the following; it
might be compared to Proposition 7.5 in the last section.

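Proposition 8.2. Suppose that $H \subseteq [N]$ is a set with $|H| \geq N/M$. Suppose that for
all $h \in H$ we have

\[ |\mathbb{E}_{n \in [N]} \Delta_h f(n) \chi_h(n)| \geq 1/M, \]

where $\chi_h(n)$ has the form (8.1) and the frequency set $\Xi_h$ is decomposed as $\Xi_* \cup \Xi_h^{\mathrm{struct}} \cup \Xi_h^{\mathrm{unstruct}}$,
where the frequencies in $\Xi_*$ do not depend on $h$ and those in $\Xi_h^{\mathrm{struct}}$ are bracket-
linear in $h$ with complexity at most $M$. Then there is a set $\tilde H \subseteq H$, $|\tilde H| \gg_M N$, such
that

\[ |\mathbb{E}_{n \in [N]} \Delta_h f(n) \tilde\chi_h(n)| \gg_M 1, \]

where $\tilde\chi_h(n)$ has the form

\[ \tilde\chi_h(n) = e(\tilde\alpha_h n^2 + \tilde\beta_h n) \prod_{1 \leq i < i' \leq \tilde k} F_{[i',i]}^{\tilde m_{[i',i]}(h)}(\tilde g_h(n)\Gamma), \]

a nilcharacter with complexity $O_M(1)$ in which the frequency set $\tilde\Xi_h$ decomposes as
$\tilde\Xi_* \cup \tilde\Xi_h^{\mathrm{struct}} \cup \tilde\Xi_h^{\mathrm{unstruct}}$, where either

(i) (core decreases) $|\tilde\Xi_*| < |\Xi_*|$, $|\tilde\Xi_h^{\mathrm{struct}}| \leq |\Xi_h^{\mathrm{struct}}|$, $|\tilde\Xi_h^{\mathrm{unstruct}}| \leq |\Xi_h^{\mathrm{unstruct}}|$, or

(ii) (unstructured part decreases) $|\tilde\Xi_*| = O_{M,|\Xi_*|}(1)$, $|\tilde\Xi_h^{\mathrm{struct}}| = |\Xi_h^{\mathrm{struct}}| + 1$,
$|\tilde\Xi_h^{\mathrm{unstruct}}| = |\Xi_h^{\mathrm{unstruct}}| - 1$.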



Proof of Proposition 8.1 given Proposition 8.2. To prove Proposition 8.1 one first, of
course, applies Step 1. With that in hand one may pick $M = O_\delta(1)$ and initialise the
inductive use of Proposition 8.2 by taking $\Xi_h^{\mathrm{unstruct}}$ to equal the entire petal frequency set
$\Xi'_h$ and $\Xi_h^{\mathrm{struct}} = \emptyset$. It is not hard to see that this repeated application of Proposition
8.2 terminates in time $O_\delta(1)$, at which point the unstructured component $\Xi_h^{\mathrm{unstruct}}$ is
empty. □
It remains, of course, to prove Proposition 8.2, and this will be the main business
of this section. Once again the key tool is Proposition 6.1, of which we require the
following variant.

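Lemma 8.3. Suppose that $H \subseteq [N]$ is a set with $|H| \geq N/M$ and that

\[ |\mathbb{E}_{n \in [N]} \Delta_h f(n) \chi_h(n)| \geq 1/M \]

for all $h \in H$, where $\chi_h(n)$ has the form (8.1) with $m_{[i',i]}(h) = 0$ if $i, i' > k_*$ and the
underlying frequency set $\Xi_h$ has been decomposed as $\Xi_* \cup \Xi_h^{\mathrm{struct}} \cup \Xi_h^{\mathrm{unstruct}}$, where the
core $\Xi_*$ does not depend on $h$ and $\Xi_h^{\mathrm{struct}}$ consists of bracket linear forms of complexity
at most $M$. Write $\chi_h(n) = \chi_h^{\mathrm{struct}}(n)\chi_h^{\mathrm{unstruct}}(n)$, where the two parts here correspond
to the structured and unstructured frequencies in $\Xi_h$. Then there is a set $H' \subseteq H$,
$|H'| \gg_M |H|$, and frequencies $\alpha_{h_1,h_2,h_3,h_4}, \beta_{h_1,h_2,h_3,h_4} \in \mathbb{R}/\mathbb{Z}$ such that

\[ |\mathbb{E}_{n \in [N]} \chi^{\mathrm{unstruct}}_{h_1}(n)\chi^{\mathrm{unstruct}}_{h_2}(n)\overline{\chi^{\mathrm{unstruct}}_{h_3}(n)\chi^{\mathrm{unstruct}}_{h_4}(n)}\, e(\alpha_{h_1,h_2,h_3,h_4} n^2 + \beta_{h_1,h_2,h_3,h_4} n)| \gg_M 1 \]

for $\gg_M N^3$ additive quadruples $h_1 + h_2 = h_3 + h_4 \in H'$.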

Proof. The idea is to apply Proposition 6.1 and then simply observe that the
contribution from the structured parts $\chi_h^{\mathrm{struct}}(n)$ can be made to cancel out. Bracket linear
forms are not quite genuinely linear, but if $\psi(h) = \alpha_1\{\beta_1 h\} + \dots + \alpha_m\{\beta_m h\} + \theta h$ then
we have $\psi(h_1) + \psi(h_2) = \psi(h_3) + \psi(h_4)$ whenever $h_1 + h_2 = h_3 + h_4$ and, for each of
$h = h_1, h_2, h_3, h_4$, the tuple $(\beta_1 h, \dots, \beta_m h) (\mathrm{mod}\ 1)$ lies
in the same cube $\prod_{j=1}^m [i_j/10, (i_j + 1)/10]$ (say), where the $i_j$ are integers between 0 and
9. By pigeonholing we may pass to a set $H' \subseteq H$ such that for each bracket-linear form
$\psi(h)$ in $\Xi_h^{\mathrm{struct}}$, and for all $h \in H'$, the corresponding tuple always lies in a cube of this
form depending only on $\psi$, and not on $h$.
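
The cube condition can be tested by brute force. The sketch below is illustrative only: the sample coefficients, the use of half-open cubes of side $1/10$, and the helper names are assumptions. It groups $h \in [N]$ by the cube containing $(\{\beta_1 h\}, \dots, \{\beta_m h\})$ and verifies exactly (as an identity of rationals) that $\psi(h_1) + \psi(h_2) = \psi(h_3) + \psi(h_4)$ for every additive quadruple drawn from a single cube.

```python
from fractions import Fraction
from itertools import product
from math import floor


def frac(x):
    """Fractional part in [0, 1)."""
    return x - floor(x)


def psi(h, alphas, betas, theta):
    """Real-valued lift of a bracket-linear form (no reduction mod 1)."""
    return sum(a * frac(b * h) for a, b in zip(alphas, betas)) + theta * h


def cube_index(h, betas):
    """Index (i_1, ..., i_m) of the cube of side 1/10 containing ({beta_j h})_j."""
    return tuple(floor(10 * frac(b * h)) for b in betas)


def check_additivity(N, alphas, betas, theta):
    """Count additive quadruples inside a common cube, and how many fail."""
    classes = {}
    for h in range(1, N + 1):
        classes.setdefault(cube_index(h, betas), []).append(h)
    checked = failures = 0
    for members in classes.values():
        member_set = set(members)
        for h1, h2, h3 in product(members, repeat=3):
            h4 = h1 + h2 - h3
            if h4 in member_set:
                checked += 1
                lhs = psi(h1, alphas, betas, theta) + psi(h2, alphas, betas, theta)
                rhs = psi(h3, alphas, betas, theta) + psi(h4, alphas, betas, theta)
                failures += lhs != rhs
    return checked, failures


# Hypothetical sample coefficients.
ALPHAS = [Fraction(2, 7), Fraction(5, 3)]
BETAS = [Fraction(1, 97), Fraction(5, 89)]
THETA = Fraction(3, 11)
checked, failures = check_additivity(60, ALPHAS, BETAS, THETA)
print(checked, failures)
```

The reason no failures occur is that $\{\beta h_1\} + \{\beta h_2\} - \{\beta h_3\} - \{\beta h_4\}$ is an integer of absolute value less than $2/10$, hence zero.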

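By Proposition 6.1 there are $\gg_M N^3$ additive quadruples $h_1 + h_2 = h_3 + h_4 \in H'$ and
frequencies $\alpha_{h_1,h_2,h_3,h_4}, \beta_{h_1,h_2,h_3,h_4} \in \mathbb{R}/\mathbb{Z}$ such that

\[ |\mathbb{E}_n \chi_{h_1}(n)\chi_{h_2}(n)\overline{\chi_{h_3}(n)\chi_{h_4}(n)}\, e(\alpha_{h_1,h_2,h_3,h_4} n^2 + \beta_{h_1,h_2,h_3,h_4} n)| \gg_M 1. \]

Now the contribution to this from the structured parts,

\[ \chi^{\mathrm{struct}}_{h_1}(n)\chi^{\mathrm{struct}}_{h_2}(n)\overline{\chi^{\mathrm{struct}}_{h_3}(n)\chi^{\mathrm{struct}}_{h_4}(n)}, \]

is a product of bracket quadratic terms of the form

\[ e(\psi(h_1) n \lfloor \theta n \rfloor + \psi(h_2) n \lfloor \theta n \rfloor - \psi(h_3) n \lfloor \theta n \rfloor - \psi(h_4) n \lfloor \theta n \rfloor) \]

or

\[ e(\theta n \lfloor \psi(h_1) n \rfloor + \theta n \lfloor \psi(h_2) n \rfloor - \theta n \lfloor \psi(h_3) n \rfloor - \theta n \lfloor \psi(h_4) n \rfloor). \]

For the quadruples $h_1, h_2, h_3, h_4$ under consideration we have $\psi(h_1) + \psi(h_2) = \psi(h_3) + \psi(h_4)$,
and so the first of these expressions is identically 1. The second is not, but by
applying Lemma 5.5 we see that it is merely a combination of terms of the form $e(\theta n^2)$,
$e(\{\alpha n\}\{\beta n\})$ and $e(\alpha n \lfloor \beta n \rfloor)$ with $\|\beta\|_{\mathbb{R}/\mathbb{Z}} = O_M(1/N)$, where $\alpha$, $\beta$ and $\theta$ depend on
$h_1, h_2, h_3, h_4$. Applying Lemma 3.5 it follows that we may completely ignore the
contribution from these structured parts, although we may need to modify the frequencies
$\alpha_{h_1,h_2,h_3,h_4}$ and $\beta_{h_1,h_2,h_3,h_4}$. □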

The next task is to use a similar (but much simpler) argument to that used for Lemma
7.3 to study the conclusion of Lemma 8.3 for a particular quadruple $h_1 + h_2 = h_3 + h_4$.
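Lemma 8.4. Let $h_1, h_2, h_3, h_4$ be fixed and suppose that nilcharacters $\chi_{h_j}(n)$ have the
form (8.1). Suppose that for each $j = 1, 2, 3, 4$ the underlying frequency set $\Xi_{h_j}$ is
decomposed as $\Xi_* \cup \Xi^{\mathrm{struct}}_{h_j} \cup \Xi^{\mathrm{unstruct}}_{h_j}$, where the core set $\Xi_*$ does not depend on $h_j$
and each element of $\Xi^{\mathrm{struct}}_{h_j}$ is a bracket linear form $\psi(h_j)$, again not depending on $h_j$.
Suppose that

\[ |\mathbb{E}_{n \in [N]} \chi^{\mathrm{unstruct}}_{h_1}(n)\chi^{\mathrm{unstruct}}_{h_2}(n)\overline{\chi^{\mathrm{unstruct}}_{h_3}(n)\chi^{\mathrm{unstruct}}_{h_4}(n)}\, e(\alpha_{h_1,h_2,h_3,h_4} n^2 + \beta_{h_1,h_2,h_3,h_4} n)| \geq 1/M. \]

Suppose that not all of the integers $m_{[i',i]}(h_j)$ corresponding to frequencies $\xi_{h_j,i}, \xi_{h_j,i'}$,
one of which is in $\Xi^{\mathrm{unstruct}}_{h_j}$, vanish. Then there is some $O_M(1)$-rational relation,
up to $O_M(1/N)$, amongst the elements of $\Xi_* \cup \Xi^{\mathrm{unstruct}}_{h_1} \cup \Xi^{\mathrm{unstruct}}_{h_2} \cup \Xi^{\mathrm{unstruct}}_{h_3} \cup \Xi^{\mathrm{unstruct}}_{h_4}$.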

Proof. Once again we interpret the assumption as an assertion about a 2-step
nilsequence. Perhaps the "correct" way to do this (and the manner more amenable to
generalisation) would be to mimic the construction of the last section and apply the
Quantitative Ratner theorem once again. However in the special case of the $U^4$-norm
being addressed by this paper there is a shortcut in which only the (simpler) quantitative
Leibman dichotomy, Theorem 4.1, is needed, and we give this here. Let us take $G$
to be the free 2-step nilpotent Lie group on the ordered generating set $\{e_\xi : \xi \in
\Xi_* \cup \Xi^{\mathrm{unstruct}}_{h_1} \cup \Xi^{\mathrm{unstruct}}_{h_2} \cup \Xi^{\mathrm{unstruct}}_{h_3} \cup \Xi^{\mathrm{unstruct}}_{h_4}\}$. As in §5 we identify the "coordinate"
functions $F_{[\xi',\xi]} : G/\Gamma \to \mathbb{C}$, and we take a polynomial sequence $g : \mathbb{Z} \to G$ whose
coordinate at $e_\xi$ is $\xi n$, for all $\xi$ in the above indexing set, and all of whose other coordinates
are zero except for that at $[e_{\xi'}, e_\xi]$ for some arbitrary pair of frequencies $\xi, \xi'$ in the
above set, where the coordinate of $g$ is some quadratic $q = q_{h_1,h_2,h_3,h_4}(n)$ to be specified
shortly. Inside $G$ take $\Gamma$ to be the lattice of integer points in the free 2-step nilpotent
Lie group. Finally, take

4 

' — ll'i-.-i,,.,*' 

i=l 

By an appropriate choice of the quadratic term $q$ we may ensure that

\[ F(g(n)\Gamma) = \chi^{\mathrm{unstruct}}_{h_1}(n) \cdots \overline{\chi^{\mathrm{unstruct}}_{h_4}(n)}\, e(\alpha_{h_1,h_2,h_3,h_4} n^2 + \beta_{h_1,h_2,h_3,h_4} n). \]

Note that we have $\int_{G/\Gamma} F = 0$. Although $F$ is only piecewise Lipschitz, it is nonetheless
the case that if $(g(n)\Gamma)_{n \in [N]}$ is $\delta$-equidistributed for an appropriate $\delta \gg_M 1$ then
$|\mathbb{E}_{n \in [N]} F(g(n)\Gamma)| \leq 1/10M$, contrary to assumption. This is because, as in the last
section, we may decompose $F$ as a sum $F_0 + F_1$ where $\|F_0\|_{\mathrm{Lip}} = O_{M,\epsilon}(1)$ and $|F_1|$ is
bounded above pointwise by a function $\Psi$ with $\int_{G/\Gamma} \Psi = O(\epsilon)$ and $\|\Psi\|_{\mathrm{Lip}} = O_{M,\epsilon}(1)$.

Thus we are forced to conclude that $(g(n)\Gamma)_{n \in [N]}$ is not $\delta$-equidistributed on $G/\Gamma$, for
some $\delta \gg_M 1$. By the quantitative Leibman dichotomy, Theorem 4.1, this implies that
there is some $k \in \mathbb{Z}^{\dim(G/[G,G])}$, $0 < |k| = O_M(1)$, such that $\|k \cdot (\pi \circ g)\|_{C^\infty[N]} = O_M(1/N)$.
In view of the way that $\pi \circ g$ was constructed, namely the fact that the horizontal part
$\pi \circ g$ contains only the terms $\xi n$ with $\xi \in \Xi_* \cup \Xi^{\mathrm{unstruct}}_{h_1} \cup \Xi^{\mathrm{unstruct}}_{h_2} \cup \Xi^{\mathrm{unstruct}}_{h_3} \cup \Xi^{\mathrm{unstruct}}_{h_4}$,
this is precisely the result claimed. □

The conclusion of Lemma 8.4 looks rather weak, but using the tools of additive
combinatorics pioneered in this context by Gowers (particularly in [8, Ch. 7]) it turns out to
be enough for us to be able to impose some bracket linear behaviour on some of the
unstructured sets $\Xi_h^{\mathrm{unstruct}}$. The following result concerning approximate homomorphisms
is our key tool. We know of no source for this precise result in the literature, though we
feel it should somehow be regarded as "known". It is appropriate to associate the
names of Freiman, Ruzsa and Gowers with results of this kind.

Proposition 8.5 (Approximate homomorphisms). Let $\delta, \epsilon \in (0,1)$ be parameters and
suppose that $f_1, f_2, f_3, f_4 : S \to \mathbb{R}/\mathbb{Z}$ are functions defined on some subset $S \subseteq [N]$ such
that there are at least $\delta N^3$ quadruples $(x_1, x_2, x_3, x_4) \in S^4$ with $x_1 + x_2 = x_3 + x_4$ and
$\|f_1(x_1) + f_2(x_2) - f_3(x_3) - f_4(x_4)\|_{\mathbb{R}/\mathbb{Z}} \leq \epsilon$. Then there is a bracket linear phase $\psi : \mathbb{Z} \to
\mathbb{R}/\mathbb{Z}$ of complexity $O_\delta(1)$ and a set $S' \subseteq S$, $|S'| \gg_\delta N$, such that $f_1(x) = \psi(x) + O(\epsilon)$
for all $x \in S'$.

Proof. See Appendix O □ 



Lemma 8.6. Let $H \subseteq [N]$ be a set of size at least $N/M$, and suppose that we have
a core set $\Xi_*$ and, for each $h \in H$, sets $\Xi_h^{\mathrm{unstruct}}$. Suppose that $|\Xi_*|, |\Xi_h^{\mathrm{unstruct}}| \leq M$.
Suppose that for at least $N^3/M$ additive quadruples $h_1 + h_2 = h_3 + h_4$ in $H$ there is an
$M$-linear relation, up to $O(M/N)$, in $\Xi_* \cup \Xi^{\mathrm{unstruct}}_{h_1} \cup \Xi^{\mathrm{unstruct}}_{h_2} \cup \Xi^{\mathrm{unstruct}}_{h_3} \cup \Xi^{\mathrm{unstruct}}_{h_4}$. Then
either

(i) There is some element of the core $\Xi_*$ which lies in the $O_M(1)$-span of the others,
up to $O_M(1/N)$, or

(ii) There is a bracket linear form $\psi$ of degree $O_M(1)$ and a set $H' \subseteq H$, $|H'| \gg_M
|H|$, such that $\psi(h)$ lies in the $O_M(1)$-linear span up to $O_M(1/N)$ of $\Xi_h^{\mathrm{unstruct}}$
for all $h \in H'$.

Proof. Let the elements of the core set $\Xi_*$ be $\{\xi_{*,1}, \dots, \xi_{*,M}\}$ and those of the petal
set $\Xi_h^{\mathrm{unstruct}}$ be $\{\xi_{h,1}, \dots, \xi_{h,M}\}$. Suppose, for a given quadruple $h_1 + h_2 = h_3 + h_4$, that the
approximate linear relation between the elements of $\Xi_* \cup \Xi^{\mathrm{unstruct}}_{h_1} \cup \Xi^{\mathrm{unstruct}}_{h_2} \cup \Xi^{\mathrm{unstruct}}_{h_3} \cup \Xi^{\mathrm{unstruct}}_{h_4}$ is

\[ \| q_{*,1}(h_1,h_2,h_3,h_4)\xi_{*,1} + \dots + q_{*,M}(h_1,h_2,h_3,h_4)\xi_{*,M} + q_{1,1}(h_1,h_2,h_3,h_4)\xi_{h_1,1} + \dots + q_{1,M}(h_1,h_2,h_3,h_4)\xi_{h_1,M} + \dots + q_{4,1}(h_1,h_2,h_3,h_4)\xi_{h_4,1} + \dots + q_{4,M}(h_1,h_2,h_3,h_4)\xi_{h_4,M} \|_{\mathbb{R}/\mathbb{Z}} = O(M/N), \]

where each integer $q$ has magnitude at most $M$. There are only $(2M+1)^{5M}$ choices for
these integers and so we may pass to a subcollection of $\gg_M N^3$ quadruples for which
there is such a relation and for which none of the $q$'s depend on $h_1, h_2, h_3, h_4$. Since $\Xi_*$
is $M$-dissociated, at least one of the $q_{i,j}$ must be nonzero, $i = 1, 2, 3, 4$; without loss of
generality, suppose that $q_{1,1} \neq 0$.

Writing $f_i(h_i) := q_{i,1}\xi_{h_i,1} + \dots + q_{i,M}\xi_{h_i,M}$, we see that we have found functions
$f_1, f_2, f_3, f_4 : H \to \mathbb{R}/\mathbb{Z}$ such that

\[ \|f_1(h_1) + f_2(h_2) - f_3(h_3) - f_4(h_4)\|_{\mathbb{R}/\mathbb{Z}} \leq 1/N' \]

for $\gg_M N^3$ additive quadruples $h_1 + h_2 = h_3 + h_4 \in H$, for some $N' \gg N/M$. Now we
apply Proposition 8.5 to conclude that there is a bracket linear phase $\psi$ of complexity
$O_M(1)$ such that $f_1(h) = \psi(h) + O_M(1/N)$ for all $h$ in some set $H' \subseteq H$, $|H'| \gg_M N$.

This concludes the proof of the lemma. □

We are now in a position to prove Proposition 8.2 which, recall, was the inductive
step driving the main result of this section, namely Proposition 8.1. The argument is
very similar to that employed in the proof of Proposition 7.5, hingeing on repeated use
of the bracket identities of Lemma 1531 to expand out linear combinations of frequencies. 
Proof of Proposition 8.2. The assumption that $|\mathbb{E}_{n \in [N]} \Delta_h f(n) \chi_h(n)| \geq 1/M$ may be
fed into Lemma 8.3 to conclude the existence of a set $H' \subseteq H$ with $|H'| \gg_M |H|$ such
that

\[ |\mathbb{E}_n \chi^{\mathrm{unstruct}}_{h_1}(n)\chi^{\mathrm{unstruct}}_{h_2}(n)\overline{\chi^{\mathrm{unstruct}}_{h_3}(n)\chi^{\mathrm{unstruct}}_{h_4}(n)}\, e(\alpha_{h_1,h_2,h_3,h_4} n^2 + \beta_{h_1,h_2,h_3,h_4} n)| \gg_M 1 \]

for $\gg_M N^3$ additive quadruples $h_1 + h_2 = h_3 + h_4$ in $H'$. This in turn may be fed into
Lemma 8.4, which allows us to conclude that for each of these additive quadruples there
is an $O_M(1)$-linear relation, up to $O_M(1/N)$, between the elements of $\Xi_* \cup \Xi^{\mathrm{unstruct}}_{h_1} \cup
\Xi^{\mathrm{unstruct}}_{h_2} \cup \Xi^{\mathrm{unstruct}}_{h_3} \cup \Xi^{\mathrm{unstruct}}_{h_4}$. There is one other possibility here, namely that in the
attempt to apply Lemma 8.4 we find that, for many quadruples $h_1 + h_2 = h_3 + h_4$,
all of the integers $m_{[i',i]}(h_j)$ corresponding to frequencies $\xi_{h_j,i}, \xi_{h_j,i'}$, one of which is in
$\Xi^{\mathrm{unstruct}}_{h_j}$, are zero. This is a rather trivial case, however, for we may then pass to the set
$\tilde H$ of $h_1$ (say) appearing here, and simply delete the unstructured frequencies $\Xi_h^{\mathrm{unstruct}}$,
which play no actual role in the expression for $\chi_h(n)$. The conclusion of Proposition 8.2
is then immediate in this case.

Returning to the main line of the argument, we may then apply Lemma 8.6 to
conclude that either

(i) There is some element $\xi \in \Xi_*$ which lies in the $O_M(1)$-linear span of the others,
up to $O_M(1/N)$, or

(ii) There is a bracket linear form $\psi$ of degree $O_M(1)$ and a set $\tilde H \subseteq H'$, $|\tilde H| \gg_M |H|$,
so that $\psi(h)$ lies in the $O_M(1)$-linear span of $\Xi_h^{\mathrm{unstruct}}$ for all $h \in \tilde H$.

These two possibilities will correspond to alternatives (i) and (ii) respectively in
Proposition 8.2. To see this we proceed rather as in the proof of Proposition 7.5, making use
once again of Lemma 5.5 as well as extensive use of Lemma 3.5 to handle the somewhat
annoyingly non-Lipschitz 1-step objects which arise. The treatment of (i) is exactly
analogous to the aforementioned argument, so we only describe (ii) in any detail.

Assume that the sets $\Xi_h^{\mathrm{unstruct}}$ are ordered as $\xi_{h,k_0+1}, \dots, \xi_{h,k}$. We are assuming that
there is a bracket-linear form $\psi(h)$ having the form $q_{h,k_0+1}\xi_{h,k_0+1} + \dots + q_{h,k}\xi_{h,k} +
O_M(1/N)$, for all $h \in H'$. Here the integers $q_{h,j}$ are all bounded in magnitude by $O_M(1)$
and so we may, by passing to a further subset $H'' \subseteq H'$, assume that they do not depend
on $h$. Without loss of generality let us suppose that $q_k \neq 0$. Then we may write $\xi_{h,k}$ as
an $O_M(1)$-linear combination of $\frac{1}{Q}\psi(h)$ and the frequencies $\frac{1}{Q}\xi_{h,k_0+1}, \dots, \frac{1}{Q}\xi_{h,k-1}$, plus
an error of $O_M(1/N)$, where $Q$ is the lcm of the numbers up to $O_M(1)$. Now we replace
$\Xi_h^{\mathrm{struct}}$ by $\Xi_h^{\mathrm{struct}} \cup \{\frac{1}{Q}\psi(h)\}$ and $\Xi_h^{\mathrm{unstruct}}$ by $\{\frac{1}{Q}\xi_{h,k_0+1}, \dots, \frac{1}{Q}\xi_{h,k-1}\}$ and then proceed to
rewrite the bracket quadratics $e(\xi n \lfloor \xi' n \rfloor)$ using these new sets of frequencies by means
of Lemma 5.5 and Lemma 3.5 exactly as we did at the end of §7. □

Before moving onto the next section we apply one additional piece of analysis to
Proposition 8.1. This allows us to conclude that the quadratic frequency $\alpha_h$ varies
bracket-linearly in $h$ as well. Thus, once this is done, only the linear term $e(\beta_h n)$ does
not have a rigid structure imposed upon it.

Proposition 8.7. In the statement of Proposition 8.1 we may assume that the
quadratic frequency $\alpha_h$ varies bracket-linearly in $h$.

Proof. We may, of course, take for granted the conclusion of Proposition 8.1. We apply
Proposition 6.1 once again, using the same argument we employed at the start of the
proof of Lemma 8.3 to first pass to a subset $H' \subseteq H$, $|H'| \gg_M N$, on which all the
bracket linear forms $\psi$ in the petals $\Xi'_h$ are linear in the sense that $\psi(h_1) + \psi(h_2) =
\psi(h_3) + \psi(h_4)$ whenever $h_1 + h_2 = h_3 + h_4$ with $h_1, h_2, h_3, h_4 \in H'$. This gives

\[ |\mathbb{E}_{n \in [N]} \chi_{h_1}(n)\chi_{h_2}(n + h_1 - h_4)\overline{\chi_{h_3}(n)\chi_{h_4}(n + h_1 - h_4)}| \gg_M 1. \]

As in Corollary 6.2, this implies that $\chi_{h_1}(n)\chi_{h_2}(n)\overline{\chi_{h_3}(n)\chi_{h_4}(n)}$ correlates with a
quadratic phase $e(\alpha_{h_1,h_2,h_3,h_4} n^2 + \beta_{h_1,h_2,h_3,h_4} n)$. Moreover a careful analysis of the proof
of that corollary, looking at the decomposition $\chi_h(n) = \chi'_h(n) e(\alpha_h n^2 + \beta_h n)$, where

\[ \chi'_h(n) = \prod_{1 \leq i < i' \leq k} F_{[i',i]}^{m_{[i',i]}(h)}(g_h(n)\Gamma), \]

reveals that we can take $\alpha_{h_1,h_2,h_3,h_4} = \alpha_{h_1} + \alpha_{h_2} - \alpha_{h_3} - \alpha_{h_4}$. That is, the genuinely
bracket-quadratic objects comprising $\chi'_h(n)$ only give rise to linear terms.

The term $\chi'_{h_1}(n)\chi'_{h_2}(n)\overline{\chi'_{h_3}(n)\chi'_{h_4}(n)}$ arising from the genuinely bracket quadratic
parts is a product of terms of the form $e(\alpha n \lfloor \psi(h_1) n \rfloor + \alpha n \lfloor \psi(h_2) n \rfloor - \alpha n \lfloor \psi(h_3) n \rfloor -
\alpha n \lfloor \psi(h_4) n \rfloor)$ where, recall, $\psi(h_1) + \psi(h_2) = \psi(h_3) + \psi(h_4)$. Using Lemma 3.5 (iii) to
move the $\psi$ terms to the outside of the brackets and applying Lemma 3.5 repeatedly,
we conclude that

\[ |\mathbb{E}_{n \in [N]} e((\alpha'_{h_1} + \alpha'_{h_2} - \alpha'_{h_3} - \alpha'_{h_4}) n^2 + \beta'_{h_1,h_2,h_3,h_4} n)| \gg_M 1 \]

for all these quadruples $h_1 + h_2 = h_3 + h_4$, where $\alpha'_h - \alpha_h$ is a bracket-linear form of
complexity $O_M(1)$. By Lemma D.1 it follows that there is some $q = O_M(1)$ such that

\[ \|q(\alpha'_{h_1} + \alpha'_{h_2} - \alpha'_{h_3} - \alpha'_{h_4})\|_{\mathbb{R}/\mathbb{Z}} = O_M(1/N^2). \]

By Proposition 8.5 there is a further subset $H'' \subseteq H'$, $|H''| \gg_M |H|$, together with a
bracket linear form $\psi'(h)$ of complexity $O_M(1)$, such that

\[ q\alpha'_h = \psi'(h) + O_M(1/N^2) \]

for all $h \in H''$. This means that

\[ \alpha_h = \psi''(h) + \frac{r_h}{q} + O_M(1/N^2), \]

where $\psi''(h)$ is another bracket linear form and $r_h$ takes integer values. Refining $[N]$
into progressions of common difference $q$ and length $\gg_M N$ small enough to make the
$O_M(1/N^2)$ error negligible, and then applying Lemma 3.5 (ii), we obtain the claim. □

9. Step 3: The symmetry argument 

Finally we turn to Step 3 of the programme outlined in §2, the so-called symmetry
argument. Here we shall take an approach somewhat different to the one we shall
employ in the general case of the $U^{s+1}$-norm, $s \geq 4$, where further use is made of the
nilmanifold distribution results of §4 and there are slightly complicated issues concerning
the keeping-track of the complexity of various bracket expressions.

In the special case of the $U^4$-norm that this paper is concerned with, a rather direct
argument using Bohr sets is possible. Let $S = \{\theta_1, \dots, \theta_d\} \subseteq \mathbb{R}/\mathbb{Z}$ be a set of frequencies
and suppose that $\rho \in (0, 1)$. Then we set

\[ B(S, \rho, N) := \{n \in [\rho N] : \|n\theta_j\|_{\mathbb{R}/\mathbb{Z}} < \rho \text{ for all } j = 1, \dots, d\}. \]
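
As a concrete illustration of this definition, the following sketch enumerates a Bohr set by brute force. It is not part of the argument: the function names, the floating-point arithmetic, the reading of $[\rho N]$ as $\{1, \dots, \lfloor \rho N \rfloor\}$, and the sample frequencies are all assumptions made for the example.

```python
def dist_to_nearest_integer(x: float) -> float:
    """The norm ||x||_{R/Z}: distance from x to the nearest integer."""
    return abs(x - round(x))


def bohr_set(S, rho, N):
    """All n in {1, ..., floor(rho*N)} with ||n*theta|| < rho for every theta in S."""
    limit = int(rho * N)
    return [
        n for n in range(1, limit + 1)
        if all(dist_to_nearest_integer(n * theta) < rho for theta in S)
    ]


# Hypothetical sample frequencies and radii.
S = [2 ** 0.5, 3 ** 0.5]
N = 5000
small = bohr_set(S, 0.1, N)
large = bohr_set(S, 0.2, N)
print(len(small), len(large))
```

Bohr sets are nested in the radius, $B(S, \rho, N) \subseteq B(S, \rho', N)$ for $\rho \leq \rho'$; regularity, introduced next, controls how quickly the size grows as $\rho$ increases slightly.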
We shall need a small amount of the theory of such sets, particularly pertaining to the
notion of regularity - the idea that there is a plentiful supply of $\rho$ for which the size of
$B(S, \rho', N)$ is nicely controlled for $\rho' \approx \rho$. The need to introduce this idea in additive
combinatorics was first appreciated in [5] and it has now appeared in several places, for
example [11] where the notion is defined in Definition 2.6 and discussed in more detail
in Chapter 8.

For our purposes here we say that a value $\rho$ is regular if we have

\[ |B(S, (1 + \kappa)\rho, N)| = |B(S, \rho, N)|(1 + O(d|\kappa|)) \]

uniformly for $|\kappa| \leq 1/d$. We shall need the following facts about regular Bohr sets. It
would be possible to obtain much more precise statements but we shall not need to do
so here.

Lemma 9.1 (Regular Bohr sets - Basic Facts). Fix a set $S = \{\theta_1, \dots, \theta_d\}$ of frequencies,
and write $B := B(S, \rho, N)$. We have the following facts.

(i) (Ubiquity of regular values) For any $\rho_0 \in (0, 1/2)$ there is a regular value of $\rho$
in the interval $[\rho_0, 2\rho_0]$.

(ii) (Fourier expansion of Bohr cutoffs) Suppose that $\rho$ is regular, and that $\epsilon > 0$ is
a parameter. Then we may decompose the cutoff $1_B(n)$ as $\psi_1(n) + \psi_2(n)$, where
$\psi_1(n) = \int_0^1 \hat\psi_1(\theta) e(\theta n)\, d\theta$ with $\|\hat\psi_1\|_1 := \int_0^1 |\hat\psi_1(\theta)|\, d\theta \leq C_{\epsilon,\rho}$ and $\sum_n |\psi_2(n)| \leq
\epsilon N$.

(iii) (Large generalised Fourier coefficients) Suppose that $\rho$ is regular and that $\phi :
B(S, 2\rho, N) \to \mathbb{R}/\mathbb{Z}$ is locally linear on $B$ in the sense that $\phi(x+y) = \phi(x) + \phi(y)$
whenever $x, y \in B$. Suppose that $|\mathbb{E}_{x \in B} e(\phi(x))| \geq \eta$. Then there is a regular
value of $\rho'$, $\rho' \gg_{\epsilon,\eta,\rho} 1$, such that $\|\phi(x)\|_{\mathbb{R}/\mathbb{Z}} \leq \epsilon$ for all $x \in B(S, \rho', N)$.

Sketch Proof. The definition of Bohr set we are using here is very slightly different
to that used in [11], in that our Bohr sets are contained in $[N]$ and not in $\mathbb{Z}/N\mathbb{Z}$.
Nonetheless, the proofs of the above statements are so close to those in $\mathbb{Z}/N\mathbb{Z}$ that
we simply refer to the relevant sections of the aforementioned paper. Statement (i) is
[11, Lemma 8.2]. Statement (ii) is not explicitly mentioned in [11]. To prove it, take
$\psi_1(n) = \frac{1}{|B'|} 1_B * 1_{B'}(n)$, where $B' := B(S, \rho', N)$ for a suitably small $\rho' \gg_{\epsilon,\rho} 1$. The
bound on $\|\hat\psi_1\|_1$ follows from Plancherel, whilst the bound on $\|\psi_2\|_1$ is a consequence
of the regularity of $B$ and the observation that $\frac{1}{|B'|} 1_B * 1_{B'}(n) = 1_B(n)$ provided that
$n \notin B(S, \rho + \rho', N) \setminus B(S, \rho - \rho', N)$. Finally, (iii) is [11, Lemma 8.4]. □

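Let us return to the main business of this section, which is to conclude the proof
of Theorem 1.5. The main result of the last section, Proposition 8.7, took us from the
assumption that $\|f\|_{U^4} \geq \delta$ to the conclusion that

\[ |\mathbb{E}_{n \in [N]} \overline{f(n)} f(n + h) \chi_h(n)| \gg_\delta 1 \qquad (9.1) \]

for a set $H$ of size $\gg_\delta N$, where $\chi_h(n)$ is a product of terms of the form $e(\{\alpha h\}\beta n \lfloor \gamma n \rfloor)$,
$e(\alpha\{\beta h\} n^2)$ and $e(\theta_h n)$. Using the fact that $\lfloor \gamma n \rfloor = \gamma n - \{\gamma n\}$, we may assume that

\[ \chi_h(n) = e\Big( \sum_{j=1}^{k} \{\alpha_j h\}\beta_j n \{\gamma_j n\} + \sum_{j=1}^{k} \alpha'_j \{\beta'_j h\} n^2 + \theta_h n \Big). \]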
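
The substitution $\lfloor \gamma n \rfloor = \gamma n - \{\gamma n\}$ trades each floor-type term for a genuinely quadratic term plus a fractional-part term. The following sketch checks the underlying identity $a n \lfloor \gamma n \rfloor = a\gamma n^2 - a n \{\gamma n\}$ in exact arithmetic; the sample values and function names are assumptions chosen for illustration, with `a` standing in for a coefficient such as $\{\alpha h\}\beta$.

```python
from fractions import Fraction
from math import floor


def frac(x):
    """Fractional part {x} = x - floor(x)."""
    return x - floor(x)


def floor_form(a, gamma, n):
    """The phase a * n * floor(gamma * n)."""
    return a * n * floor(gamma * n)


def split_form(a, gamma, n):
    """The same phase written as a*gamma*n^2 - a*n*{gamma*n}."""
    return a * gamma * n * n - a * n * frac(gamma * n)


# Hypothetical sample values.
a = Fraction(3, 8)
gamma = Fraction(7, 5)
agree = all(floor_form(a, gamma, n) == split_form(a, gamma, n)
            for n in range(-50, 51))
print(agree)
```

The quadratic part $a\gamma n^2$ is of the same shape as the terms $\alpha'_j\{\beta'_j h\}n^2$ already present, which is why it can be absorbed into the second sum.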

Later on it will be convenient to assume that

For all $h \in H$ we have $\|\theta h\|_{\mathbb{R}/\mathbb{Z}} \geq \rho_1$ for all $\theta \in \{\alpha_1, \dots, \alpha_k, \beta'_1, \dots, \beta'_k\}$, (9.2)






for some small parameter $\rho_1 > 0$ to be specified later. This can be achieved at the
expense of thinning out $H$ somewhat to a set of size merely $\gg_{\rho_1,\delta} N$, as we now show.

To demonstrate the last claim we distinguish two types of such $\theta$. We say that $\theta$ is
good if the number of $h \in H$ such that $\|\theta h\|_{\mathbb{R}/\mathbb{Z}} < \rho_1$ is at most $10\rho_1 N$. By refining $H$
to a set $H' \subseteq H$ with $|H'| \geq |H| - 20\rho_1 k N$, we may assume that $\|\theta h\|_{\mathbb{R}/\mathbb{Z}} \geq \rho_1$ for all
$h \in H'$ and for all good $\theta$. Note that $|H'| \geq |H|/2$ if $\rho_1$ is chosen small enough as a
function of $\delta$, as it will be later on. If $\theta$ is not good then the sequence $(n\theta (\mathrm{mod}\ \mathbb{Z}))_{n \in [N]}$
is not $\rho_1$-equidistributed, and by well-known results of diophantine approximation (see,
for example, [HI Proposition 3.1]) there is some $q \ll \rho_1^{-C}$ such that $\|q\theta\|_{\mathbb{R}/\mathbb{Z}} \ll \rho_1^{-C}/N$.
This means that the bracket $\{\theta h\}$ takes on only $\rho_1^{-2C}$ values as $h$ ranges over $[N]$, and
so there is a subset $H'' \subseteq H'$, $|H''| \gg \rho_1^{2Ck}|H|$, on which all these brackets are constant.
This means that the corresponding terms in $\chi_h(n)$ may be ignored, for the purpose of
(9.1), since they depend just on $n$ and not on $h$. Replacing $H$ by $H''$ gives the claim,
and henceforth we assume that (9.2) holds, remembering that we now only have the
weaker bound $|H| \gg_{\rho_1,\delta} N$.

Write

\[ T(x, y, z) := \sum_{j=1}^{k} \{\alpha_j x\} \frac{\beta_j}{3} y \{\gamma_j z\} + \sum_{j=1}^{k} \frac{\alpha'_j}{3} \{\beta'_j x\} y z, \]

so that $3T(h, n, n)$ is the form appearing in the definition of $\chi_h(n)$. Here, there are three
possible choices for each $\beta_j/3$, $\alpha'_j/3$ and it does not matter which we take; the reason for
introducing these 3's will become apparent later. Then $T(x, y, z)$ is trilinear on the Bohr
set $B := B(S, \rho_0, N)$, where $S = \{\alpha_1, \dots, \alpha_k, \gamma_1, \dots, \gamma_k, \beta'_1, \dots, \beta'_k\}$ and the parameter
$\rho_0 \in [\frac{1}{20}, \frac{1}{10}]$ is chosen so that $B$ is regular. By stating that $T$ is trilinear we mean that,
for example, $T(x_1 + x_2, y, z) = T(x_1, y, z) + T(x_2, y, z)$ when all of $x_1, x_2, x_1 + x_2, y, z$
lie in $B$. We begin by symmetrising $T$ in the last two variables, a straightforward task.
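For each $j$ pick some $\tilde\beta_j$ such that $2\tilde\beta_j = \beta_j/3$ (there are two choices) and set

\[ \tilde T(x, y, z) := \sum_{j=1}^{k} \{\alpha_j x\} \tilde\beta_j y \{\gamma_j z\} + \sum_{j=1}^{k} \{\alpha_j x\} \tilde\beta_j z \{\gamma_j y\} + \sum_{j=1}^{k} \frac{\alpha'_j}{3} \{\beta'_j x\} y z. \]

Then of course $\tilde T(h, n, n) = T(h, n, n)$, but now $\tilde T(x, y, z)$ is symmetric in the last two
variables. Dropping the tildes, we assume henceforth that $T$ itself is symmetric in the
last two variables.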
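
The two properties of the symmetrised form can be verified mechanically. The sketch below is a simplified illustration: it treats all frequencies as exact rationals rather than elements of $\mathbb{R}/\mathbb{Z}$, takes $\tilde\beta_j = \beta_j/6$ as the chosen solution of $2\tilde\beta_j = \beta_j/3$, and uses hypothetical sample frequencies with $k = 2$.

```python
from fractions import Fraction
from math import floor


def frac(x):
    """Fractional part {x} = x - floor(x)."""
    return x - floor(x)


# Hypothetical sample frequencies (k = 2).
ALPHA = [Fraction(3, 7), Fraction(5, 11)]
BETA = [Fraction(2, 13), Fraction(7, 17)]
GAMMA = [Fraction(4, 9), Fraction(1, 19)]
ALPHA2 = [Fraction(1, 5), Fraction(6, 23)]
BETA2 = [Fraction(8, 29), Fraction(3, 31)]


def T(x, y, z):
    """The original form, not symmetric in (y, z)."""
    first = sum(frac(a * x) * (b / 3) * y * frac(g * z)
                for a, b, g in zip(ALPHA, BETA, GAMMA))
    second = sum((a2 / 3) * frac(b2 * x) * y * z
                 for a2, b2 in zip(ALPHA2, BETA2))
    return first + second


def T_sym(x, y, z):
    """The symmetrised form, using beta_j / 6 in place of beta_j / 3."""
    first = sum(frac(a * x) * (b / 6) * (y * frac(g * z) + z * frac(g * y))
                for a, b, g in zip(ALPHA, BETA, GAMMA))
    second = sum((a2 / 3) * frac(b2 * x) * y * z
                 for a2, b2 in zip(ALPHA2, BETA2))
    return first + second


R = range(1, 8)
same_diagonal = all(T_sym(h, n, n) == T(h, n, n) for h in R for n in R)
symmetric = all(T_sym(x, y, z) == T_sym(x, z, y) for x in R for y in R for z in R)
print(same_diagonal, symmetric)
```

Only the diagonal values $T(h, n, n)$ enter $\chi_h(n)$, so replacing $T$ by its symmetrisation loses nothing.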

Our assumption, then, is that

\[ |\mathbb{E}_{n \in [N]} \overline{f(n)} f(n + h) e(3T(h, n, n)) e(\theta_h n)| \gg_\delta 1 \]

for all $h$ lying in some set $H$ of size at least $\gg_{\rho_1,\delta} N$, where $H$ additionally satisfies
(9.2). Our immediate goal is to localize the variables $h$ and $n$ to small Bohr sets so that
we may properly exploit the trilinearity of $T$.

Let us briefly reprise the heuristic mentioned in §2 to recall why it is that we
expect $T$ to be symmetric in the first two coordinates as well (on a "nice set"). Suppose
we knew that $f(n)\overline{f(n + h)} = \chi_h(n) = e(3T(h, n, n)) e(\theta_h n)$ for all $n, h$. Then we get

\[ \chi_h(n + k)\chi_k(n) = f(n)\overline{f(n + h + k)} = \chi_k(n + h)\chi_h(n). \]

Using the trilinearity of $T$ and symmetry in the last two coordinates we get

\[ 6T(h, k, n) = 6T(k, h, n). \]
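
For the reader's convenience, here is one way the heuristic computation might be carried out. It is a sketch under the idealised assumptions of this paragraph (exact identities, global trilinearity, symmetry of $T$ in its last two variables), with all phases read modulo 1.

```latex
% Take phases in \chi_h(n+k)\chi_k(n) = \chi_k(n+h)\chi_h(n), where
% \chi_h(n) = e(3T(h,n,n) + \theta_h n):
\begin{align*}
3T(h,n+k,n+k) + \theta_h (n+k) + 3T(k,n,n) + \theta_k n
  &\equiv 3T(k,n+h,n+h) + \theta_k (n+h) + 3T(h,n,n) + \theta_h n .
\end{align*}
% Expand with trilinearity and T(h,n,k) = T(h,k,n):
\begin{align*}
3T(h,n+k,n+k) &= 3T(h,n,n) + 6T(h,k,n) + 3T(h,k,k), \\
3T(k,n+h,n+h) &= 3T(k,n,n) + 6T(k,h,n) + 3T(k,h,h).
\end{align*}
% After cancelling the common terms:
\begin{align*}
6T(h,k,n) - 6T(k,h,n) &\equiv 3T(k,h,h) - 3T(h,k,k) + \theta_k h - \theta_h k .
\end{align*}
% The right-hand side does not depend on n, while the left-hand side is
% linear in n and vanishes at n = 0. Hence both sides vanish, giving
% 6T(h,k,n) \equiv 6T(k,h,n).
```

The factor 6, rather than 3, arises from the two cross terms $T(h,n,k)$ and $T(h,k,n)$ being equal, which is presumably one reason for the normalisation by 3 in the definition of $T$.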
Now of course we do not have proper equations but only correlations, we don't have
correlation for all $h$ but only for "many", and we have trilinearity only when the variables
are restricted to Bohr sets, so we must work much harder.

We start with the $h$ variable. Set $B_1 := B(S, \rho_1, N)$, where $\rho_1$ is the (as yet
unspecified) quantity appearing in (9.2). Modifying $\rho_1$ by at most a factor of two, we may
assume that $B_1$ is regular. We claim that it is possible to find an $h_0 \in H$ such that the
intersection $H \cap (h_0 + B_1)$ has size $\gg_{\rho_1,\delta} N$. A slight trick is necessary to establish this:
consider

\[ \sum_{n \in [N]} 1_H * 1_{B'} * 1_{B'}(n) 1_H(n), \]

where $B' := B(S, \rho_1/2, N)$. On the one hand this equals $\sum_n (1_H * 1_{B'}(n))^2$ which, by
the Cauchy-Schwarz inequality, is $\gg_{\rho_1,\delta} N^3$. On the other hand we have $1_{B'} * 1_{B'}(n) \leq
|B_1| 1_{B_1}(n)$ for $n \in [N]$, and from these two inequalities the claim follows immediately.
Our assumption now implies that

\[ |\mathbb{E}_{n \in [N]} f(n)\overline{f(n+h_0+h')}\, e(3T(h_0+h',n,n))\, e(\theta_{h_0+h'} n)| \gg_{\delta} 1 \]

for all $h'$ lying in some set $H' \subseteq B_1 = B(S, \rho_1, N)$, $|H'| \gg_{\rho_1,\delta} N$. By the careful
construction of $H$ (cf. (9.2)) and the fact that $h_0 \in H$ we have $\{\alpha_j(h_0 + h')\} = \{\alpha_j h_0\} + \{\alpha_j h'\}$,
and similarly for the $\beta_j$, and hence we obtain the linearity property
$T(h_0 + h', n, n) = T(h_0, n, n) + T(h', n, n)$. After relabelling we hence have

\[ |\mathbb{E}_{n \in [N]} f_1(n)\overline{f_2(n+h)}\, e(3T(h,n,n))\, e(\theta_h n)| \gg_{\delta} 1 \]

for all $h \in H$, where $H \subseteq B_1$, $|H| \gg_{\rho_1,\delta} N$, $f_1(n) := f(n)e(T(h_0,n,n))$ and
$f_2(n) := f(n + h_0)$.

We must now localise the $n$ variable, and for this we use a somewhat different trick.
By averaging there is some $n_0$ such that

\[ |\mathbb{E}_{n \in [N]} f_1(n+n_0)\overline{f_2(n+n_0+h)}\, e(3T(h,n+n_0,n+n_0))\, e(\theta_h n)\, 1_{B_1}(n)| \gg_{\delta} 1. \]

Now we have

\[ \beta(n+n_0)\{\gamma(n+n_0)\} = \beta n_0\{\gamma(n+n_0)\} + \beta n\{\gamma n\} + \beta n\{\gamma n_0\} + \beta n\big(\{\gamma(n+n_0)\} - \{\gamma n\} - \{\gamma n_0\}\big). \]

Substituting into the expression for $e(3T(h, n+n_0, n+n_0))$ and expanding, we see that
the contribution from the term $e(\beta n_0\{\gamma(n+n_0)\})$ may be absorbed into the linear term
$e(\theta_h n)$ (by Lemma 3.5), as may the term $e(\beta n\{\gamma n_0\})$ (trivially). The term
$\{\gamma(n+n_0)\} - \{\gamma n\} - \{\gamma n_0\}$ takes values in $\{-1, 0, 1\}$ according to whether $\gamma n \pmod 1$ lies in certain
intervals $I^{-1}, I^{0}, I^{1}$, and so we obtain

\[ \Big| \mathbb{E}_{n \in [N]} f_1'(n)\overline{f_2'(n+h)}\, e(3T(h,n,n))\, e(\theta_h n) \prod_{j=1}^{k} \Big( 1_{\gamma_j n \in I_j^{-1}} e(-\{\alpha_j h\}\beta_j n) + 1_{\gamma_j n \in I_j^{0}} + 1_{\gamma_j n \in I_j^{+1}} e(\{\alpha_j h\}\beta_j n) \Big) 1_{B_1}(n) \Big| \gg 1, \]

where $f_1'(n) = f_1(n + n_0)$ and $f_2'(n) = f_2(n + n_0)$. It follows that there is a choice of
$\varepsilon_j \in \{-1, 0, 1\}$ and a $\theta'_h$ such that

\[ \Big| \mathbb{E}_{n \in [N]} f_1'(n)\overline{f_2'(n+h)}\, e(3T(h,n,n))\, e(\theta'_h n) \prod_{j=1}^{k} 1_{\gamma_j n \in I_j^{\varepsilon_j}}\, 1_{B_1}(n) \Big| \gg 1. \]

By Lemma 3.5 we may remove the product of interval cutoffs at the expense of changing $\theta'_h$ again.
Removing the dashes for notational convenience we now obtain

\[ |\mathbb{E}_{n \in [N]} f_1(n)\overline{f_2(n+h)}\, e(3T(h,n,n))\, e(\theta_h n)\, 1_{B_1}(n)| \gg 1. \]

Here, $f_1(n) = f(n + n_0)e(T(h_0, n + n_0, n + n_0))$ and $f_2(n) = f(n + h_0 + n_0)$, and we
recall once more that this is known to hold for $\gg_{\rho_1,\delta} N$ values of $h \in B_1$.

Set $\chi_h(n) := e(3T(h,n,n))e(\theta_h n)1_{B_1}(n)$. Applying Proposition 6.1 we obtain

\[ \mathbb{E}_{n \in [N]} \chi_{h_1}(n)\chi_{h_2}(n + h_1 - h_4)\overline{\chi_{h_3}(n)}\,\overline{\chi_{h_4}(n + h_1 - h_4)} \gg_{\rho_1,\delta} 1 \tag{9.3} \]

for at least $c_{\rho_1,\delta}N^3$ additive quadruples $(h_1, h_2, h_3, h_4)$ in $B_1$ with $h_1 + h_2 = h_3 + h_4$.
We have already, in previous sections, extracted "top order" information from statements
like this and our task here is to exploit the additional structure inherent in (9.3),
particularly that present in the terms $h_1 - h_4$.

Parametrising these by $h_1 = h$, $h_2 = h + a + b$, $h_3 = h + a$, $h_4 = h + b$ we obtain

\[ \mathbb{E}_{n \in [N]} \chi_h(n)\chi_{h+a+b}(n+b)\overline{\chi_{h+a}(n)}\,\overline{\chi_{h+b}(n+b)} \gg_{\rho_1,\delta} 1 \]

for at least $c_{\rho_1,\delta}N^3$ triples $h, a, b$ with $h \in B_1$ and $a, b \in B(S, 3\rho_1, N)$. Substituting in
the definition of $\chi_h(n)$, and using the trilinearity of $T$ we obtain

\[ \mathbb{E}_{n} e\big(6T(a,b,n) + (\theta_h + \theta_{h+a+b} - \theta_{h+a} - \theta_{h+b})n\big) 1_{B_1}(n) 1_{B_1}(n+b) \gg_{\rho_1,\delta} 1 \tag{9.4} \]

for at least $c_{\rho_1,\delta}N^3$ triples $h, a, b$ with $h \in B_1$ and $a, b \in B(S, 3\rho_1, N)$. Pigeonholing
in $h$, one sees that there is some fixed $h$ such that this holds for at least $c_{\rho_1,\delta}N^2$ pairs
$a, b \in B(S, 3\rho_1, N)$. Let $\epsilon = \epsilon(\rho_1, \delta)$ be a small positive quantity to be specified very
shortly. By Lemma 9.1 (ii) and the regularity of $B_1$ we may expand

\[ 1_{B_1}(n + b) = \int_0^1 \psi_1(\theta) e(\theta(n + b))\, d\theta + \psi_2(n), \]

where $\|\psi_1\|_1 \leq C_{\epsilon,\rho_1,\delta}$ and $\sum_n |\psi_2(n)| \leq \epsilon N$. Choosing $\epsilon$ so that the contribution to
(9.4) from $\psi_2(n)$ is negligible, we see using the triangle inequality that there is some
$\theta \in [0, 1]$ such that

\[ \mathbb{E}_{n \in B_1} e\big(6T(a,b,n) + (\theta_h + \theta_{h+a+b} - \theta_{h+a} - \theta_{h+b} + \theta)n\big) \gg_{\rho_1,\delta} 1 \tag{9.5} \]

for the same fixed $h$ and many pairs $a, b$ as before. For each $a, b$ write $\phi_{a,b}(n)$ for the
phase appearing here, thus

\[ \phi_{a,b}(n) := 6T(a,b,n) + (\eta_a + \eta'_b + \eta''_{a+b})n \]

where $\eta_a := \theta_h - \theta_{h+a} + \theta$, $\eta'_b := -\theta_{h+b}$ and $\eta''_{a+b} := \theta_{h+a+b}$. Equation (9.5) implies that

\[ |\mathbb{E}_{n \in B_1} e(\phi_{a,b}(n))| \gg_{\rho_1,\delta} 1. \]

Let $\varepsilon = \varepsilon(\delta, \rho_1)$ be a small positive parameter to be specified later. By Lemma 9.1
(iii) there is some $\rho_2 = \rho_2(\varepsilon, \rho_1, \delta)$ such that we have

\[ \|\phi_{a,b}(n)\|_{\mathbb{R}/\mathbb{Z}} \leq \varepsilon \]

for all $n \in B_2 := B(S, \rho_2, N)$ and for these same pairs $a, b$, that is to say for at least
$c_{\rho_1,\delta}N^2$ pairs $a, b \in B(S, 3\rho_1, N)$. Thus

\[ 6T(a,b,n) = (\eta_a + \eta'_b + \eta''_{a+b})n + O(\varepsilon) \]

for at least $c_{\rho_1,\delta}N^2$ choices of $a, b \in B(S, 3\rho_1, N)$ and for all $n \in B_2$. For at least $c_\delta N^3$
triples $a, b, b'$ we thus have

\[ 6T(a, b - b', n) = (\eta'_b - \eta'_{b'} + \eta''_{a+b} - \eta''_{a+b'})n + O(\varepsilon) \]

for all $n \in B_2$. Writing $c := a + b + b'$ it follows that

\[ 6T(c - b - b', b - b', n) = (\eta'_b - \eta'_{b'} + \eta''_{c-b'} - \eta''_{c-b})n + O(\varepsilon) \]

for at least $c_{\rho_1,\delta}N^3$ triples $c, b, b' \in B(S, 9\rho_1, N)$ and for all $n \in B_2$. Fix some $c$ for which
this holds for at least $c_{\rho_1,\delta}N^2$ pairs $b, b'$; then by trilinearity of $T$ we have

\[ 6(T(b, b', n) - T(b', b, n)) = \kappa_b(n) + \kappa'_{b'}(n) + O(\varepsilon) \]

for all these pairs $b, b'$ and for all $n \in B_2$, where $\kappa_b(n) := \eta'_b n - \eta''_{c-b} n - 6T(c, b, n) + 6T(b, b, n)$
and $\kappa'_{b'}(n) := -\eta'_{b'} n + \eta''_{c-b'} n + 6T(c, b', n) - 6T(b', b', n)$. The exact form of these expressions
is not relevant, as we shall very shortly see.

Indeed for at least $c_{\rho_1,\delta}N^3$ triples $b_1, b_2, b' \in B(S, 3\rho_1)$ we have

\[ 6(T(b_1 - b_2, b', n) - T(b', b_1 - b_2, n)) = \kappa_{b_1}(n) - \kappa_{b_2}(n) + O(\varepsilon) \]

for all $n \in B_2$, and hence for at least $c_{\rho_1,\delta}N^4$ quadruples $b_1, b_2, b'_1, b'_2 \in B(S, 3\rho_1)$ we have

\[ 6(T(b_1 - b_2, b'_1 - b'_2, n) - T(b'_1 - b'_2, b_1 - b_2, n)) = O(\varepsilon) \]

for all $n \in B_2$. There are at least $c_{\rho_1,\delta}N^2$ different pairs $x, y \in B(S, 6\rho_1)$ represented as
$x = b_1 - b_2$, $y = b'_1 - b'_2$, and for each of them

\[ 6(T(x, y, n) - T(y, x, n)) = O(\varepsilon) \]

for all $n \in B_2$. Write $A \subseteq [N]^2$ for the set of these pairs, thus $|A| \geq c_\delta N^2$. Let us write
$A \oplus A$ for the set of all pairs $(x, y_1 \pm y_2)$ where both $(x, y_1)$ and $(x, y_2)$ lie in $A$, together
with all pairs $(x_1 \pm x_2, y)$ where both $(x_1, y)$ and $(x_2, y)$ lie in $A$. By bilinearity we see
that

\[ 6(T(x, y, n) - T(y, x, n)) = O(k\varepsilon) \]

for all pairs $(x, y)$ in the $k$-fold bilinear sumset $A \oplus A \oplus \cdots \oplus A$ and for all $n \in B$.

Now by Lemma B.2 this $k$-fold bilinear sumset $A' := A \oplus A \oplus \cdots \oplus A$ contains a product
$P \times P$ provided that $k \geq C_\delta$, where $P$ is an arithmetic progression which contains 0 and
has length $N$ and common difference $d = O_\delta(1)$. Thus for all triples $x, y, z \in P \cap B$ we
have

\[ T(x, y, z) - T(y, x, z) = O(k\varepsilon) + \sigma_{x,y,z}, \tag{9.6} \]

where $\sigma_{x,y,z}$ takes values in $\mathbb{Z}/6\mathbb{Z}$.
Recall that we have

\[ |\mathbb{E}_{n \in [N]} f_1(n)\overline{f_2(n+h)}\, e(3T(h,n,n))\, e(\theta_h n)\, 1_{B_1}(n)| \gg 1 \]

for many $h \in B_1$. By the pigeonhole principle, there are $h_1, n_1$ such that

\[ |\mathbb{E}_{n \in [N]} f_1(n+n_1)\overline{f_2(n+h_1+h)}\, e(3T(h_1+h, n_1+n, n_1+n))\, 1_{P \cap B_1}(n)\, 1_{B_1}(n_1+n)| \gg 1 \]

for many $h \in P \cap B_1$. Obviously $n \in B(S, 2\rho_1, N)$, and so we may expand
$T(h_1 + h, n_1 + n, n_1 + n)$ using trilinearity. Doing this, absorbing the linear terms into $e(\theta'_h n)$
using Lemma 3.5 and expanding the cutoff $1_{B_1}(n_1 + n)$ as a Fourier series using Lemma
9.1 (ii), we obtain

\[ |\mathbb{E}_{n \in [N]} f_1'(n)\overline{f_2'(n+h)}\, e(3T(h,n,n))\, 1_{P \cap B_1}(n)\, e(\theta'_h n)| \gg 1 \]

for many $h \in P \cap B_1$. Here, $f_1'(n) = f_1(n + n_1)e(T(h_1, n, n)) = f(n + n_0 + n_1)e(T(h_0, n + n_0 + n_1, n + n_0 + n_1) + T(h_1, n, n))$
whilst $f_2'(n) = f(n + h_0 + h_1 + n_0 + n_1)$. Once again
we drop the dashes in what follows for notational convenience.

By the trilinearity of $T$ and the approximate symmetry (9.6) of $T$ in the first two
variables, the genuine symmetry in the last two and another application of Lemma 3.5
to handle the terms which are linear in $n$, it follows that

\[ |\mathbb{E}_{n \in [N]} f_1(n)e(-T(n,n,n))\,\overline{f_2(n+h)}\, e(T(n+h, n+h, n+h))\, 1_{P \cap B_1}(n)\, e(\theta_h n)\, e(2\sigma_{n,h,n})| \gg 1, \]

provided that $\varepsilon$ was chosen sufficiently small in terms of $\delta$. This, recall, is for many
$h \in P \cap B_1$.

Now from (9.6) and the smallness of $\varepsilon$ we see that $\sigma : (P \cap B_1)^3 \to \mathbb{Z}/6\mathbb{Z}$ is trilinear.
Thus $\sigma_{n,h,n}$ is constant as $n, h$ vary over any translate of $Q := 6 \cdot (P \cap B_1) := \{6x : x \in P \cap B_1\}$.
Since $P \cap B_1$ may be covered by $O_\delta(1)$ such translates, we may pigeonhole yet
again to conclude the existence of $h_2, n_2$ such that

\[ |\mathbb{E}_{n \in [N]} f_1(n+n_2)e(-T(n_2+n, n_2+n, n_2+n))\,\overline{F_2(n+h)}\, 1_Q(n+n_2)\, e(\theta'_h n)| \gg 1 \]

for many $h$, where

\[ F_1(n) := f_1(n+n_2)e(-T(n_2+n, n_2+n, n_2+n)) \]

\[ = f(n+n_0+n_1+n_2)\, e\big(-T(n_2+n, n_2+n, n_2+n) + T(h_0, n+n_0+n_1+n_2, n+n_0+n_1+n_2) + T(h_1, n_2+n, n_2+n)\big) \]

and $F_2$ is a 1-bounded function whose precise nature is unimportant. It follows from
this and an expansion of $1_Q(n + n_2)$ as a Fourier series that

\[ \mathbb{E}_h \|F_1(n)\overline{F_2(n+h)}\|_{U^2}^4 \gg 1. \]

Expanding out implies that the Gowers inner product $\langle F_1, F_1, F_1, F_1, F_2, F_2, F_2, F_2 \rangle_{U^3}$ is
$\gg 1$. By the Gowers-Cauchy-Schwarz inequality we see that $\|F_1\|_{U^3} \gg 1$ which, by the
inverse theorem for the $U^3$ norm, implies that

\[ |\mathbb{E}_{n \in [N]} F_1(n)\overline{\Psi(n)}| \gg 1 \]

for some 2-step nilsequence $\Psi(n)$.

Now $F_1(n)$ is equal to $f(n + n_0 + n_1 + n_2)$ times a variety of bracket terms. By Lemma
3.6 each of those bracket terms is a product of almost nilsequences of degree at most
3. Thus $f$ itself has inner product $\gg 1$ with a degree 3 almost nilsequence on $[N]$. As
we observed in Lemma 3.3 this is enough to establish (at last!) the inverse conjecture
for the $U^4$-norm, that is to say Theorem 1.5. □

Appendix A. Lifting results for nilmanifolds 

In this section we establish some slightly technical results concerning the relationship
between points on a connected, simply-connected nilpotent Lie group $G$ and points in
the nilmanifold $G/\Gamma$. These results were necessary in §7.

We begin with a folklore result of quantitative linear algebra type. 

Lemma A.1 (Bounded equations have bounded solutions). Suppose that $A$ is an $m \times n$
matrix and that $b \in \mathbb{C}^m$. Suppose that all of the entries of $A$ are rational numbers of
complexity at most $M$, and that the entries of $b$ are bounded by $M$. Then if the equation
$Ax = b$ has a solution over $\mathbb{C}^n$, it has a solution in which each coordinate is bounded by
$O_{M,m,n}(1)$.

Sketch proof. By removing rows of $A$ if necessary we may assume that the rows of $A$
are linearly independent. One may then augment $A$ to a nonsingular $n \times n$ matrix $\tilde A$
by adding appropriate basis vectors. Augment $b$ to a vector $\tilde b \in \mathbb{C}^n$ by simply adding
$n - m$ zeros to $b$. Then the equation $Ax = b$ has a solution given by $x = \tilde A^{-1}\tilde b$. All
entries of $x$ are bounded by $O_{M,m,n}(1)$ by the construction of $\tilde A^{-1}$, the key point here
being to note that $|\det \tilde A|$ is bounded below by $\Omega_{M,m,n}(1)$ since it is a nonzero rational
number of complexity $O_{M,m,n}(1)$. □
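The sketch proof can be mimicked computationally. The following is a minimal illustration only, under the extra assumption that $b$ is rational so that exact arithmetic is available; the function name `bounded_solution` is ours and does not come from the paper. Setting every free variable to zero plays the role of appending basis-vector rows to $A$ and zeros to $b$.

```python
from fractions import Fraction

def bounded_solution(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination over the rationals.

    Free variables are set to zero, which mirrors augmenting A by basis
    vectors and b by zeros. Returns None if the system is inconsistent.
    """
    m, n = len(A), len(A[0])
    # Augmented matrix [A | b] in exact rational arithmetic.
    rows = [[Fraction(v) for v in A[i]] + [Fraction(b[i])] for i in range(m)]
    pivots = []  # pivot column of each independent row, in order
    r = 0
    for col in range(n):
        p = next((i for i in range(r, m) if rows[i][col] != 0), None)
        if p is None:
            continue  # no pivot in this column: it is a free variable
        rows[r], rows[p] = rows[p], rows[r]
        lead = rows[r][col]
        rows[r] = [v / lead for v in rows[r]]
        for i in range(m):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [vi - factor * vr for vi, vr in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    # A dependent row reading 0 = nonzero means there is no solution.
    if any(row[n] != 0 for row in rows[r:]):
        return None
    x = [Fraction(0)] * n
    for i, col in enumerate(pivots):
        x[col] = rows[i][n]
    return x

# Hypothetical example: two independent equations in three unknowns.
A = [[1, 2, 3], [2, 4, 7]]
b = [6, 13]
x = bounded_solution(A, b)
```

In this example the middle variable is free and is set to zero, and the remaining coordinates are rationals whose size is controlled by the entries of $A$ and $b$, in the spirit of the lemma.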

We record the following special case. 

Corollary A.2 (Linear lifting). Suppose that $V \leq \mathbb{R}^n$ is a vector subspace given by
the vanishing of linear forms over $\mathbb{Z}$ with coefficients of magnitude at most $M$. Let
$\pi : \mathbb{R}^n \to \mathbb{R}^m$ be projection onto the first $m$ coordinates. Suppose that the entries of
$x \in \mathbb{R}^m$ are bounded by $M$, and that $\pi^{-1}(x) \cap V$ is nonempty. Then $\pi^{-1}(x) \cap V$ contains
a vector whose entries are bounded by $O_{M,m,n}(1)$.

Proof. The condition that a vector $y$ lies in $\pi^{-1}(x) \cap V$ may be encoded as $Ay = b$,
where this linear system includes the equations $y_1 = x_1, \ldots, y_m = x_m$ and the equations
that $y$ must satisfy in order to lie in $V$. By construction the entries of $A$ are rational
numbers of complexity at most $M$ and the entries of $b$ are bounded. The corollary
therefore follows from the preceding lemma. □

Using a little Lie theory, this last result has the following further corollary. 

Corollary A.3. Suppose that $G$ is a connected, simply-connected nilpotent Lie group
and let $\pi : G \to G/[G,G]$ be the natural projection. Suppose that the Lie algebra $\mathfrak{g} = \log G$
has a basis $\mathcal{X} = \{X_1, \ldots, X_m, X_{m+1}, \ldots, X_n\}$, where $\pi(\mathcal{X}) := \{\pi(X_1), \ldots, \pi(X_m)\}$
is a basis for $\mathfrak{g}/[\mathfrak{g}, \mathfrak{g}] = \log(G/[G,G])$ as a vector space over $\mathbb{R}$. Suppose that $H$ is an
$M$-rational connected subgroup relative to $\mathcal{X}$, and that $\pi(H)$ contains an element $x \in \mathbb{R}^m$
whose entries, written in the basis $\pi(\mathcal{X})$, are bounded by $M$. Then there is an element
$\tilde x \in H$ with $\pi(\tilde x) = x$ whose entries are bounded by $O_M(1)$.

Proof. Let $\mathfrak{h} = \log H$ be the Lie algebra of $H$. By standard Lie theory (see, for example,
[3]) the exponential/logarithm maps from $\mathfrak{g}$ to $G$ and from $\mathfrak{h}$ to $H$ are diffeomorphisms.
The result now follows from the preceding corollary upon taking $V = \mathfrak{h}$. □

This last corollary took place at the level of Lie groups. The actual result we required
in §7 concerned lifting from nilmanifolds. We state it now.

Proposition A.4 (Lifting from nilmanifolds). Let $G/\Gamma$ be a nilmanifold with Mal'cev
basis $\mathcal{X} = \{X_1, \ldots, X_m, X_{m+1}, \ldots, X_n\}$ and of complexity at most $M$, and let $H \leq G$
be a closed connected $M$-rational subgroup giving rise to a subnilmanifold $H\Gamma/\Gamma$. Then
there is a quantity $\varepsilon_M > 0$ with the following property. Suppose that $H[G,G]\Gamma/[G,G]\Gamma$,
identified with the torus $\mathbb{R}^m/\mathbb{Z}^m$ using the Mal'cev basis $\mathcal{X}$, contains an element $x$ whose
reduced coordinates (those nearest 0) are all at most $\varepsilon_M$. Let $\psi : G \to G/[G,G]\Gamma$ be
the natural projection onto the horizontal torus of $G/\Gamma$. Then there is a lift $\tilde x \in H$ with
coordinates $O_M(1)$ whose first $m$ coordinates are precisely the reduced coordinates of $x$.
Proof. The Mal'cev coordinates give a commutative diagram

\[ \begin{array}{ccc} G/[G,G] & \longrightarrow & \mathbb{R}^m \\ \downarrow & & \downarrow \\ G/\Gamma[G,G] & \longrightarrow & \mathbb{R}^m/\mathbb{Z}^m. \end{array} \tag{A.1} \]

The inclusion of $H[G,G]/[G,G]$ into $G/[G,G]$ identifies the former with a vector subspace
$V \leq \mathbb{R}^m$ given by the vanishing of linear forms over $\mathbb{Z}$ with coefficients of magnitude
$O_M(1)$, and then $H[G,G]\Gamma/[G,G]\Gamma$ becomes identified with $V\mathbb{Z}^m/\mathbb{Z}^m$. Note
that this last object is not in general connected, being a union of a finite number of
cosets of a subtorus of $\mathbb{R}^m/\mathbb{Z}^m$. We claim that there is an intermediate lift $x'$ of $x$ to
$H[G,G]/[G,G]$ whose coordinates in $\mathbb{R}^m$ are the same as the reduced coordinates of $x$
in $\mathbb{R}^m/\mathbb{Z}^m$. Once this claim is proved we may use the last corollary to lift $x'$ again,
under the map $\pi : G \to G/[G,G]$, thereby confirming the proposition.

The claim is a completely abelian statement concerning tori. To prove it, suppose
that the linear relations over $\mathbb{Z}$ which define $V$ as a subspace of $\mathbb{R}^m$ are given by
$\sum_{j=1}^{m} k_{ij}x_j = 0$, $i = 1, \ldots, m'$. Suppose that $\varepsilon_M < |k_{ij}|^{-1}/10m$ (say) for all $i, j$ and that $x$, written as
$(x_1, \ldots, x_m)$ in reduced coordinates, lies in $V\mathbb{Z}^m/\mathbb{Z}^m$. By assumption we have $|x_j| \leq \varepsilon_M$
for all $j$. Then $\sum_{j=1}^{m} k_{ij}x_j$ is an integer, yet it also has magnitude at most $1/10$. It must
therefore vanish, which means that the element $x' \in G/[G,G]$ whose coordinates in $\mathbb{R}^m$ are
precisely those of $x$ must lie in $H[G,G]/[G,G]$, as claimed. □
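The abelian claim can be checked on a toy example. The sketch below is only an illustration with hypothetical data (a single relation $y_1 - 2y_2 = 0$ in $\mathbb{R}^2$, and helper names of our own choosing): a point of $V + \mathbb{Z}^2$ with small reduced coordinates has reduced coordinates that satisfy the relation exactly.

```python
from fractions import Fraction
from math import floor

def reduced(t):
    """Representative of t mod 1 lying in [-1/2, 1/2)."""
    return t - floor(t + Fraction(1, 2))

def relation_values(relations, x):
    """Evaluate each integer relation sum_j k_ij * r_j at the reduced coordinates r of x."""
    r = [reduced(c) for c in x]
    return r, [sum(k * c for k, c in zip(rel, r)) for rel in relations]

# Hypothetical example: V = {y in R^2 : y_1 - 2*y_2 = 0}, so m = 2 and max |k_ij| = 2.
relations = [(1, -2)]
eps = Fraction(1, 40)                      # 1 / (10 * m * max|k_ij|)
v = (Fraction(1, 50), Fraction(1, 100))    # a point of V
x = (v[0] + 3, v[1] - 1)                   # the same point shifted by a lattice vector
r, values = relation_values(relations, x)
```

Here the relation evaluated at `x` itself is a nonzero integer, whereas at the reduced coordinates `r` it is an integer of magnitude at most $1/10$ and hence vanishes.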



Appendix B. Sarkozy- type results 

In this section we prove a lemma that was used in the course of the so-called symmetry
argument in §9. It is a familiar principle in additive combinatorics that if one takes some
fairly "dense" set A in an abelian group then the sumsets 2 A = A + A, 3A = A + A + A 
become progressively more structured, containing longer and longer progressions and 
ever larger Bohr sets. See, for example, [21 IH E] • Sarkozy [21] was the first to observe 
that in very high-order sumsets kA, one may locate very large amounts of structure 
indeed. The following rather neat version of his result follows directly from a theorem 
of Lev ([231 Theorem 2']): 

Theorem B.1 (Lev). Suppose that $A \subseteq [N]$ is a set of size $\alpha N$. Then for any $k \geq 2/\alpha$
the set $kA - kA := A + \cdots + A - A - \cdots - A$ contains an arithmetic progression
$\{0, d, 2d, \ldots, (N-1)d\}$ where $d \leq 1/\alpha$.
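As a sanity check of the statement (not of its proof), one can compute $kA - kA$ by brute force for a small set. The sketch below uses a hypothetical example, the even numbers in $[10]$, and helper names of our own.

```python
def iterated_difference_set(A, k):
    """Return kA - kA = {a_1 + ... + a_k - b_1 - ... - b_k : a_i, b_i in A}."""
    kA = {0}
    for _ in range(k):
        kA = {s + a for s in kA for a in A}
    return {s - t for s in kA for t in kA}

def smallest_full_progression(D, N):
    """Smallest d >= 1 with {0, d, ..., (N-1)d} contained in D, or None."""
    bound = max(D) // (N - 1) if N > 1 else max(D)
    for d in range(1, bound + 1):
        if all(j * d in D for j in range(N)):
            return d
    return None

N = 10
A = {2, 4, 6, 8, 10}   # density alpha = 1/2 in [N]
k = 4                  # k >= 2/alpha
D = iterated_difference_set(A, k)
d = smallest_full_progression(D, N)
```

For this set $kA - kA$ consists of the even integers between $-32$ and $32$, so the progression found has common difference $2 = 1/\alpha$, the extreme case allowed by the theorem.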

In §9 we required a kind of "bilinear" version of this. Suppose that $A \subseteq [N]^2$ is
a set. Let us write $A \oplus A$ for the set of all pairs $(x, y_1 \pm y_2)$ where both $(x, y_1)$ and
$(x, y_2)$ lie in $A$, together with all pairs $(x_1 \pm x_2, y)$ where both $(x_1, y)$ and $(x_2, y)$ lie
in $A$. The importance of this definition for us lies in the fact that if a bilinear form
is approximately annihilated by $A$ then it is also approximately annihilated by
$A \oplus A$.
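The definition of $A \oplus A$ translates directly into code. The following sketch (function names and the sample set are our own, chosen for illustration) also checks the remark about approximate annihilation for the bilinear form $(x,y) \mapsto \theta xy$ modulo 1: an error of $\varepsilon$ on $A$ becomes at most $2\varepsilon$ on $A \oplus A$.

```python
from fractions import Fraction
from math import floor

def bilinear_sumset(A):
    """A (+) A for a set A of integer pairs, following the definition in the text."""
    out = set()
    pairs = list(A)
    for (x1, y1) in pairs:
        for (x2, y2) in pairs:
            if x1 == x2:              # two points in the same vertical fibre
                out.add((x1, y1 + y2))
                out.add((x1, y1 - y2))
            if y1 == y2:              # two points in the same horizontal fibre
                out.add((x1 + x2, y1))
                out.add((x1 - x2, y1))
    return out

def circle_dist(t):
    """Distance from t to the nearest integer."""
    return abs(t - floor(t + Fraction(1, 2)))

# Hypothetical example data.
A = {(1, 2), (1, 3), (2, 3)}
theta = Fraction(1, 100)
S = bilinear_sumset(A)
eps = max(circle_dist(theta * x * y) for x, y in A)
worst = max(circle_dist(theta * x * y) for x, y in S)
```

The factor 2 comes from the triangle inequality applied in one variable at a time, which is why a $k$-fold iterate costs a factor of $O(k)$.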

Proposition B.2 (Bilinear Sarkozy result). Suppose that $A \subseteq [N]^2$ is a set of size
$\alpha N^2$. Then for $k \geq 128/\alpha^3$ the $k$-fold iterated bilinear sumset $A \oplus A \oplus \cdots \oplus A$ contains a
product $P \times P'$, where $P = \{0, d, 2d, \ldots, (N-1)d\}$ and $P' = \{0, d', 2d', \ldots, (N-1)d'\}$
with $0 < d, d' \leq 4/\alpha^2$.

Proof. For each $x \in [N]$ write $A_x := \{y \in [N] : (x, y) \in A\}$ for the vertical fibre of $A$
above $x$. By a simple averaging argument there are at least $\alpha N/2$ values of $x$ for which
$|A_x| \geq \alpha N/2$. For each such $x$ the vertical sumset $kA_x - kA_x$, where $k \geq 4/\alpha$, contains
a progression $P = \{0, d_x, 2d_x, \ldots, (N-1)d_x\}$ with $0 < d_x \leq 2/\alpha$. By the pigeonhole
principle we may pass to a further set $\{A_x : x \in X\}$ of vertical fibres, $|X| \geq \alpha^2 N/4$,
which all have the same value of $d_x$, say $d$. By a further application of Lev's theorem
the set $lX - lX$, $l \geq 8/\alpha^2$, contains a progression $P' = \{0, d', 2d', \ldots, (N-1)d'\}$ with
$0 < d' \leq 4/\alpha^2$. □

Remark. We believe that it ought to be possible to prove a structural result in which 
only some bounded sum A © A © • • • © A is involved, where the number of summands 
does not depend on a (and might, for example, be 16). Such a result would deserve to 
be called a "bilinear Bogolyubov theorem" by analogy with Bogolyubov's lemma [2]. 
One would not expect to find a structure as simple and rich as the product $P \times P'$; we
expect the relevant structure to be, rather, a "transverse set", the intersection of sets
of the form $\{(x, y) \in [N]^2 : \|\theta xy\|_{\mathbb{R}/\mathbb{Z}} \leq \varepsilon\}$.



Appendix C. Structure of approximate homomorphisms 

The aim of this appendix is to indicate a proof of Proposition 8.5, whose statement we
recall now. As we said before, this result is somehow "known" without being explicitly
given anywhere in the literature. The forthcoming Barbados lectures of the first author
will give a self-contained treatment of results of this type.

Proposition 8.5 (Approximate homomorphisms). Let $\delta, \varepsilon \in (0, 1)$ be parameters and
suppose that $f_1, f_2, f_3, f_4 : S \to \mathbb{R}/\mathbb{Z}$ are functions defined on some subset $S \subseteq [N]$ such
that there are at least $\delta N^3$ quadruples $(x_1, x_2, x_3, x_4) \in S^4$ with $x_1 + x_2 = x_3 + x_4$ and
$\|f_1(x_1) + f_2(x_2) - f_3(x_3) - f_4(x_4)\|_{\mathbb{R}/\mathbb{Z}} \leq \varepsilon$. Then there is a bracket linear phase $\psi : \mathbb{Z} \to \mathbb{R}/\mathbb{Z}$
of complexity $O_\delta(1)$ and a set $S' \subseteq S$, $|S'| \gg_\delta N$, such that $f_1(x) = \psi(x) + O(\varepsilon)$
for all $x \in S'$.

Proof. We begin with a "rounding" trick to dispose of the error of $\varepsilon$ in the range. Take
$N' := \lceil 1/\varepsilon \rceil$ and for $i = 1, 2, 3, 4$ define $\tilde f_i : S \to \mathbb{R}/\mathbb{Z}$ by taking $\tilde f_i(x) = r/N'$, where $r$,
$0 \leq r < N'$, is the integer such that $r/N'$ is nearest to $f_i(x)$ in $\mathbb{R}/\mathbb{Z}$ (ties being broken
arbitrarily). Then of course $\tilde f_i(x) = f_i(x) + O(\varepsilon)$ for all $x \in S$ and so

\[ \tilde f_1(x_1) + \tilde f_2(x_2) - \tilde f_3(x_3) - \tilde f_4(x_4) = O(\varepsilon) \]

for the set of additive quadruples $(x_1, x_2, x_3, x_4) \in S^4$ in the hypothesis of the proposition.
The quantity $\|\tilde f_1(x_1) + \tilde f_2(x_2) - \tilde f_3(x_3) - \tilde f_4(x_4)\|_{\mathbb{R}/\mathbb{Z}}$ is quantised and restricted to
integer multiples of $1/N'$, and there are only $O(1)$ such numbers with magnitude $O(\varepsilon)$.
It follows that there is some $\theta$ such that $\|\tilde f_1(x_1) + \tilde f_2(x_2) - \tilde f_3(x_3) - \tilde f'_4(x_4)\|_{\mathbb{R}/\mathbb{Z}} = 0$ for
$c\delta N^3$ additive quadruples $(x_1, x_2, x_3, x_4) \in S^4$, where $\tilde f'_4(x) = \tilde f_4(x) + \theta$.
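The rounding step is easy to make concrete. The sketch below is an illustration under our own naming (`round_to_grid`, and `Q` for the number of grid points, taken here to be $\lceil 1/\varepsilon \rceil$, which is an assumption about the intended rounding): values in $\mathbb{R}/\mathbb{Z}$ are snapped to the nearest multiple of $1/Q$, which moves each value by at most $\varepsilon/2$ and makes every signed combination of rounded values a multiple of $1/Q$.

```python
from fractions import Fraction
from math import ceil, floor

def circle_dist(t):
    """Distance from t to the nearest integer, i.e. the norm on R/Z."""
    return abs(t - floor(t + Fraction(1, 2)))

def round_to_grid(value, eps):
    """Nearest multiple r/Q of 1/Q to value mod 1, where Q = ceil(1/eps) and 0 <= r < Q."""
    Q = ceil(1 / eps)
    r = floor(value * Q + Fraction(1, 2)) % Q
    return Fraction(r, Q)

# Hypothetical sample values of f_1, ..., f_4 at four points.
eps = Fraction(1, 10)
values = [Fraction(37, 100), Fraction(99, 100), Fraction(1, 3), Fraction(62, 100)]
rounded = [round_to_grid(v, eps) for v in values]
combo = rounded[0] + rounded[1] - rounded[2] - rounded[3]
```

Because the rounded combination is quantised, the condition "equal to $O(\varepsilon)$" leaves only boundedly many possible values, which is what allows one of them to be pigeonholed and absorbed into $\theta$.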

Writing $\Gamma_i := \{(x, \tilde f_i(x)) : x \in S\} \subseteq \mathbb{Z} \times \mathbb{R}/\mathbb{Z}$, $i = 1, 2, 3$, and $\Gamma_4 := \{(x, \tilde f'_4(x)) : x \in S\} \subseteq \mathbb{Z} \times \mathbb{R}/\mathbb{Z}$
for the "graphs" of $\tilde f_1, \tilde f_2, \tilde f_3$ and $\tilde f'_4$, this means that the additive
energy (cf. [25, Chapter 2]) $E(\Gamma_1, \Gamma_2, \Gamma_3, \Gamma_4)$ is at least $c\delta N^3$. By [25, Corollary
2.10] (or the Cauchy-Schwarz-Gowers inequality) it follows that the additive energy
$E(\Gamma_1, \Gamma_1, \Gamma_1, \Gamma_1)$ is at least $c\delta^C N^3$, or in other words that there are $\geq c\delta^C N^3$ additive
quadruples $(x_1, x_2, x_3, x_4) \in S^4$ for which $\|\tilde f_1(x_1) + \tilde f_1(x_2) - \tilde f_1(x_3) - \tilde f_1(x_4)\|_{\mathbb{R}/\mathbb{Z}} = 0$.

From this point on we give references to the paper [11] of the first two authors,
which is reasonably well-adapted to our purposes. Most of the ideas here go back to [El 
Chapter 7] and to earlier work of Ruzsa. Starting from the assumption that the graph 
T has large additive energy, the key steps are the following^. 

(i) [11, Proposition 5.4] Apply the Balog-Szemeredi-Gowers theorem followed by
the Plunnecke-Ruzsa inequalities to conclude that there is a set $S_0 \subseteq S$, $|S_0| \gg_\delta N$,
such that the graph $\Gamma_0 := \{(x, \tilde f_1(x)) : x \in S_0\}$ satisfies an iterative
sumset estimate $|k\Gamma_0 - l\Gamma_0| \ll_{k,l,\delta} N$ for all integers $k, l \geq 1$.

(ii) [11, Proposition 9.1] The function $\tilde f_1$ correlates with a function which is locally
linear on a Bohr set. This means that there is a Bohr set $B = B(\Theta, \rho, N)$
with $\Theta = \{\theta_1, \ldots, \theta_d\} \subseteq \mathbb{R}/\mathbb{Z}$, $d = O_\delta(1)$ and $\rho \gg_\delta 1$ together with a function
$\phi : B \to \mathbb{R}/\mathbb{Z}$ satisfying $\phi(x + y) = \phi(x) + \phi(y)$ whenever $x, y, x + y \in B(\Theta, \rho, N)$,
as well as some $x_0 \in [N]$ and some $\theta_0 \in \mathbb{R}/\mathbb{Z}$ such that $\tilde f_1(x + x_0) = \theta_0 + \phi(x)$ for
$\gg_\delta N$ values of $x \in (S_0 - x_0) \cap B$. The appropriate definitions here are given
in full in [11] and are also recalled in §9 of the present paper.

(iii) Apply some geometry of numbers to conclude that any such linear function
has the form $\phi(x) = a_1\{\theta_1 x\} + \cdots + a_d\{\theta_d x\} + \theta x$ on some multidimensional
progression $P \subseteq B$ with $|P| \gg_\delta N$. The proof of this is very similar to, but
easier than, that of [11, Proposition 10.8], where an analogous statement is
established for locally quadratic phase functions on Bohr sets.

It follows from all of this that we have

\[ \tilde f_1(x) = \theta_0 + a_1\{\theta_1(x - x_0)\} + \cdots + a_d\{\theta_d(x - x_0)\} + \theta(x - x_0) \]

for all $x$ in some set $S_1 \subseteq S_0$, $|S_1| \gg_\delta N$.

Now we have $\{\theta_j(x - x_0)\} = \{\theta_j x\} - \{\theta_j x_0\} + \tau_{j,x}$, where $\tau_{j,x}$ takes values in $\{-1, 0, 1\}$.
By the pigeonhole principle we may pass to a further subset $S_2 \subseteq S_1$ with $|S_2| \gg_\delta N$
such that, for all $x \in S_2$, each of the $\tau_{j,x}$ is independent of $x$.
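The carry terms and the pigeonhole step can be illustrated as follows. This is a sketch with hypothetical frequencies and our own helper names; it computes the carry vector for each $x$ exactly and keeps the most popular class.

```python
from collections import defaultdict
from fractions import Fraction
from math import floor

def frac(t):
    """Fractional part {t} in [0, 1)."""
    return t - floor(t)

def carry(theta, x, x0):
    """The term tau with {theta (x - x0)} = {theta x} - {theta x0} + tau."""
    return frac(theta * (x - x0)) - frac(theta * x) + frac(theta * x0)

def largest_carry_class(thetas, xs, x0):
    """Largest subset of xs on which the vector of carries is constant."""
    classes = defaultdict(list)
    for x in xs:
        classes[tuple(carry(t, x, x0) for t in thetas)].append(x)
    return max(classes.values(), key=len)

# Hypothetical data: two frequencies, shift x0 = 5, and x ranging over 1..40.
thetas = [Fraction(3, 10), Fraction(7, 13)]
xs = list(range(1, 41))
x0 = 5
S2 = largest_carry_class(thetas, xs, x0)
```

Since each carry takes one of at most three values, there are at most $3^d$ classes, so the largest class retains a proportion at least $3^{-d}$ of the points.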

Take $S' := S_2$. Then for $x \in S'$ we have

\[ \tilde f_1(x) = \theta' + a_1\{\theta_1 x\} + \cdots + a_d\{\theta_d x\} + \theta x, \]

a bracket linear form of complexity $d = O_\delta(1)$. Recalling that $\tilde f_1(x) = f_1(x) + O(\varepsilon)$, the
result follows. □

Remark. The rounding trick we used to remove the $\varepsilon$ errors was a slightly dirty one
but makes the argument quite short given known results. It would probably be possible,
and more natural in some moral sense, to run through the Balog-Szemeredi-Gowers and
Freiman arguments carrying an $O(\varepsilon)$ error throughout.

Footnote (to the key steps listed in Appendix C). Strictly speaking, the tools we are applying there only apply to groups rather than to intervals such as
$[N]$. However, this can be easily addressed by temporarily embedding $[N]$ in, say, $\mathbb{Z}/10N\mathbb{Z}$; we omit
the details.

Appendix D. Some diophantine results

This section recalls some well-known results from Diophantine approximation which,
in the context of this paper, may be naturally viewed as distributional results for abelian
(1-step) nilsequences. We will use them repeatedly in the next section. Furthermore
Lemma [D.2I below was crucial in §7J and Lemma ID. II was required at the end of §EJ 

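Lemma D.1. Let $d \geq 1$ be an integer, let $\varepsilon \in (0, 1/2)$ be a parameter, and suppose
that $\psi(n) = \alpha_d n^d + \cdots + \alpha_0$ is a polynomial of degree $d$ such that $(\psi(n) \pmod 1)_{n \in [N]}$ is
not $\varepsilon$-equidistributed on $\mathbb{R}/\mathbb{Z}$. Then for all $i = d, d-1, \ldots, 1$ there are coprime integers
$a_i, q_i$, $q_i \leq \varepsilon^{-C_d}$, such that

\[ \Big| \alpha_i - \frac{a_i}{q_i} \Big| \leq \frac{\varepsilon^{-C_d}}{N^i}. \]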

Proof. This is actually a special case of the Quantitative Leibman Dichotomy, Theorem
4.1, although this is a somewhat misleading statement to make since it is also a crucial
ingredient in the proof of that result. It is proven using Weyl's criterion for equidistribution
and Weyl's inequality (see, for example, [27]), and indeed the statement that
the lead coefficient $\alpha_d$ is close to rational is essentially equivalent to that inequality.
The other coefficients $\alpha_{d-1}, \alpha_{d-2}, \ldots$ may be shown to be almost rational iteratively; the
argument is given in detail in [13, §4]. □

Secondly we recall a quantitative version of Kronecker's theorem, phrased in language
appropriate to §7. Once again this is a special case of the Quantitative Leibman
Dichotomy, and once again it is very well-known.

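Lemma D.2. Let $d \geq 1$ be an integer, let $\varepsilon \in (0, 1/2)$ be a parameter, and let
$\alpha_1, \ldots, \alpha_d \in \mathbb{R}/\mathbb{Z}$ be frequencies. Suppose that $((\alpha_1 n, \ldots, \alpha_d n) \pmod 1)_{n \in [N]}$ fails to
be $\varepsilon$-equidistributed in the torus $(\mathbb{R}/\mathbb{Z})^d$. Then the set $\{\alpha_1, \ldots, \alpha_d\}$ satisfies an $\varepsilon^{-C_d}$-linear
relation up to $\varepsilon^{-C_d}/N$ (that is, there are integers $m_1, \ldots, m_d$, not all zero, with
$|m_i| \leq \varepsilon^{-C_d}$ for all $i$ and $\|m_1\alpha_1 + \cdots + m_d\alpha_d\|_{\mathbb{R}/\mathbb{Z}} \leq \varepsilon^{-C_d}/N$).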



Proof. This is discussed in detail in [13, §3]. Here is a very rough sketch: if the sequence
is not $\varepsilon$-equidistributed, there is some Lipschitz function $F : (\mathbb{R}/\mathbb{Z})^d \to \mathbb{C}$ with

\[ \Big| \mathbb{E}_{n \in [N]} F(\alpha_1 n, \ldots, \alpha_d n) - \int_{(\mathbb{R}/\mathbb{Z})^d} F(\theta)\, d\theta \Big| \geq \varepsilon \|F\|_{\mathrm{Lip}}. \]

Lipschitz functions are well-approximated in $L^\infty$ by their Fourier series; expanding $F$
into such a series, it follows that some exponential sum

\[ \mathbb{E}_{n \in [N]} e(\vec m \cdot \vec\alpha n) \]

has modulus at least $\varepsilon^{C_d}$, where $\vec m = (m_1, \ldots, m_d)$ is nonzero and $|m_i| \leq \varepsilon^{-C_d}$. The lemma now
follows with an application of the formula for the sum of a geometric series. □
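The final step rests on the standard geometric series bound $|\mathbb{E}_{n \in [N]} e(\theta n)| \leq \min(1, 1/(2N\|\theta\|_{\mathbb{R}/\mathbb{Z}}))$, so that an average of modulus at least $\eta$ forces $\|\theta\|_{\mathbb{R}/\mathbb{Z}} \leq 1/(2N\eta)$. The sketch below checks this numerically in floating point for a hypothetical frequency; the helper names are ours.

```python
import cmath
from math import floor

def e(t):
    """The character e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * cmath.pi * t)

def circle_dist(t):
    """Distance from t to the nearest integer."""
    return abs(t - floor(t + 0.5))

def average_exponential_sum(theta, N):
    """The average of e(theta * n) over n = 1, ..., N."""
    return sum(e(theta * n) for n in range(1, N + 1)) / N

# Hypothetical frequency and length.
theta, N = 0.0137, 1000
avg = abs(average_exponential_sum(theta, N))
bound = 1 / (2 * N * circle_dist(theta))
```

In the lemma this is applied with $\theta = m_1\alpha_1 + \cdots + m_d\alpha_d$ and $\eta = \varepsilon^{C_d}$, which yields the claimed linear relation.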



Appendix E. Almost nilsequences 

The aim of this section is to establish Lemmas 3.5 and 3.6, which asserted that various
objects - chiefly bracket polynomials - are 1-, 2- and 3-step almost nilsequences.

Lemma 3.5. Suppose that $\alpha, \beta \in [0, 1]$ and that $M > 1$ is a complexity parameter. The
following are all examples of almost nilsequences of degree 1 and complexity $O_M(1)$:

(i) the set of 1-step Lipschitz nilsequences of complexity at most $M$;

(ii) the set of characteristic functions $1_P$, where $P \subseteq [N]$ is a progression of length
at least $N/M$;

(iii) the set of functions of the form $n \mapsto e(\alpha\{\beta n\})$, with $\alpha \in \mathbb{R}$ and $\beta \in \mathbb{R}/\mathbb{Z}$;

(iv) the set of functions of the form $n \mapsto e(\{\alpha n\}\{\beta n\})$, with $\alpha, \beta \in \mathbb{R}/\mathbb{Z}$;

(v) the set of functions of the form $n \mapsto e(\alpha n\lfloor \beta n \rfloor)$, where $\|\beta\|_{\mathbb{R}/\mathbb{Z}} \leq M/N$.

Proof. (i) is trivial.

To prove (ii) we first note that $1_P(n)$ can be expressed as the product of $1_I(n)$ and
$1_{n \equiv a \pmod q}$, where $I \subseteq [N]$ is an interval and $q \leq M$. The second object is in fact
a 1-step nilsequence $F(g(n)\Gamma)$ on $\mathbb{R}/\mathbb{Z}$, the polynomial sequence $g : \mathbb{Z} \to \mathbb{R}$ being
$g(n) = n/q$ and the function $F : \mathbb{R}/\mathbb{Z} \to [0, 1]$ being Lipschitz, equal to 1 at $a/q$ and
supported within $1/10q$ (say) of $a/q$. The first object, $1_I$, is not quite a genuine 1-step
nilsequence. However let us observe that any function $\psi : [N] \to \mathbb{C}$ with Lipschitz
constant $O(1/N)$ is a genuine 1-step nilsequence; indeed we have $\psi(n) = F(g(n)\Gamma)$ on
$\mathbb{R}/\mathbb{Z}$, where $g(n) = n/2N$ and $F : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$ is defined by setting $F(n/2N) := \psi(n)$
for $n \in [N]$ and by Lipschitz extension elsewhere. Now simply note that $1_I$ may be
approximated arbitrarily closely, in $L^1[N]$, by functions $\psi$ of this type. Specifically, we
may take a sequence of Lipschitz "tent" functions $\psi$ which equal 1 on $I$ and are zero at
points distance more than $\varepsilon N$ from $I$. The claim now follows from Lemma 3.2.
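The tent approximation can be made explicit. In the sketch below (our own helper names, with a hypothetical interval $I = [200, 600]$ inside $[1000]$) the tent decays linearly over a window of width $\varepsilon N$, so it has Lipschitz constant $1/(\varepsilon N)$ and differs from $1_I$ by at most $\varepsilon$ in normalised $L^1[N]$.

```python
def tent(n, lo, hi, width):
    """Equals 1 on [lo, hi], decays linearly to 0 at distance `width`, and is 0 beyond."""
    if lo <= n <= hi:
        return 1.0
    dist = lo - n if n < lo else n - hi
    return max(0.0, 1.0 - dist / width)

def l1_error(N, lo, hi, eps):
    """Normalised L^1[N] distance between the tent of width eps*N and the indicator of [lo, hi]."""
    width = eps * N
    total = sum(abs(tent(n, lo, hi, width) - (1.0 if lo <= n <= hi else 0.0))
                for n in range(1, N + 1))
    return total / N

# Hypothetical interval inside [N] with N = 1000.
N, lo, hi, eps = 1000, 200, 600, 0.05
err = l1_error(N, lo, hi, eps)
steps = max(abs(tent(n + 1, lo, hi, eps * N) - tent(n, lo, hi, eps * N))
            for n in range(1, N))
```

Letting $\varepsilon \to 0$ gives the sequence of approximants referred to in the proof, at the cost of a Lipschitz constant growing like $1/\varepsilon$.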

To establish (iii) we first note that if $\alpha = \alpha' \pmod 1$ then $\alpha\{\beta n\} = \alpha'\{\beta n\} + (\alpha - \alpha')\beta n \pmod 1$,
and so we may assume that $0 \leq \alpha \leq 1$. Let $\varepsilon > 0$ be arbitrary and define
$F : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$ by $F(x) = e(\alpha\{x\})$ and divide into two cases: either $(\beta n \pmod 1)_{n \in [N]}$ is
$\varepsilon/10$-equidistributed on $\mathbb{R}/\mathbb{Z}$, or it is not. In the former case we take a $100/\varepsilon$-Lipschitz
function $\tilde F$ which agrees with $F$ outside of the set $\{x \in \mathbb{R}/\mathbb{Z} : \|x\|_{\mathbb{R}/\mathbb{Z}} \leq \varepsilon/10\}$
and is bounded by 1 elsewhere. By the assumed equidistribution we obviously have
$e(\alpha\{\beta n\}) = F(\beta n) = \tilde F(\beta n)$ for all except at most $\varepsilon N/2$ values of $n$. The result is then
immediate.

If, on the other hand, the sequence $(\beta n \pmod 1)_{n \in [N]}$ fails to be $\varepsilon/10$-equidistributed
then by Lemma D.1 with $d = 1$ there is an integer $q \leq \varepsilon^{-C}$ and an $a \in \mathbb{Z}$ such that
$\|\beta - a/q\|_{\mathbb{R}/\mathbb{Z}} \leq \varepsilon^{-C}/N$. This in turn means that we may divide $[N]$ into progressions
$P_1 \cup \cdots \cup P_m$, $m \ll \varepsilon^{-C}$, on which $n \mapsto e(\alpha\{\beta n\})$ varies by at most $\varepsilon/100$. Since (by
part (ii)) functions which are constant on progressions are almost 1-step nilsequences,
the result follows (using Lemma 3.2 as necessary).

To prove (iv) we use a trick. The function $(x, y) \mapsto e(xy)$ on the square $[0, 1]^2$ may be
smoothly extended to a periodic function on $[0, 2]^2$. By Fourier analysis (cf. [12, Lemma
A.9]) it may then be uniformly approximated to any desired accuracy $\varepsilon$ by a linear combination
of frequencies $e((kx + ly)/2)$, $k, l \in \mathbb{Z}$. Thus $e(\{\alpha n\}\{\beta n\})$ may be approximated
uniformly by a linear combination of functions of the form $e(k\{\alpha n\}/2)e(l\{\beta n\}/2)$. But
such functions are almost 1-step nilsequences by (iii), and the claim follows from Lemma 3.2.

Finally we turn to (v). The condition that $\|\beta\|_{\mathbb{R}/\mathbb{Z}} \leq M/N$ means that we may divide
$[N]$ into subprogressions (in fact subintervals) $P_1 \cup \cdots \cup P_m$, $m = O_M(1)$, such that
$\lfloor \beta n \rfloor$ is equal to some constant $c_j$ for $n \in P_j$. The result then follows from (ii) and
Lemma 3.2. □

Now we turn to higher degree bracket polynomial phases. 

Lemma 3.6. Suppose that $\alpha, \beta, \gamma \in [0, 1]$. Then the following are all examples of almost
nilsequences of degree $s \geq 2$ and complexity $O(1)$:

(i) $n \mapsto e(\lfloor \alpha n \rfloor \beta n)$, of degree 2;

(ii) $n \mapsto e(\lfloor \alpha n \rfloor \beta n^2)$, of degree 3;

(iii) $n \mapsto e(\lfloor \alpha n \rfloor \lfloor \beta n \rfloor \gamma n)$, of degree 3.

Proof. The proofs of all three parts are somewhat similar and proceed along the following
lines: each object may be exhibited in a fairly obvious way as a nilsequence $F(g(n)\Gamma)$,
where $F$ is, however, only piecewise Lipschitz. If the sequence $(g(n)\Gamma)_{n \in [N]}$ is highly
equidistributed then it spends sufficient time away from singularities for one to be able
to approximate by $\tilde F(g(n)\Gamma)$, where $\tilde F$ is genuinely Lipschitz. If not then there must
be an approximate rational relation between the horizontal frequencies of $g(n)$ (that
is, the frequencies occurring in the projection to $G/[G,G]$). This may then be used to
approximate the object in question by objects of lower complexity.

To exhibit these arguments as part of a more general theory is not a particularly
easy matter and involves a more conceptual understanding of bracket identities such as
those in Lemma 5.5 and others such as (E.3) below. The required theory is implicit in
the work of Leibman [22] and will be introduced properly in our longer paper to come.

In this paper we can proceed in an ad hoc and slightly calculational way, taking
advantage of one or two simplifications specific to the $U^4$ (3-step) case. In a sense,
however, these calculations also serve as motivation for the longer paper to come. We
begin by recalling the constructions of §5 leading up to (5.2). Specialising to the free
2-step nilpotent group on two generators (essentially the Heisenberg group) we have

\[ F_{[1,2]}(g(n)\Gamma) = e(\lfloor \alpha n \rfloor \beta n) \]

and

\[ F_{[1,2]}(g'(n)\Gamma) = e(\lfloor \alpha n \rfloor \beta n^2) \]

where $F_{[1,2]} : G/\Gamma \to \mathbb{C}$ is the basic coordinate function introduced in Definition 5.3 and
$g, g' : \mathbb{Z} \to G$ are polynomial sequences of degree 2 and 3 respectively given in coordinates
by $g(n) = (\alpha n, -\beta n, 0)$, $g'(n) = (\alpha n, -\beta n^2, 0)$. Only the first two coordinates
(corresponding to the horizontal torus $G/[G,G]$) are really important.

The discontinuities of $F_{[1,2]}$ are very manageable: the key point, already exploited
in §3, is that for any $\varepsilon > 0$ there are $\varepsilon^{-C}$-Lipschitz functions $\tilde F : G/\Gamma \to \mathbb{C}$ and
$\Psi : G/\Gamma \to [0, 1]$ such that $\int_{G/\Gamma} \Psi \leq \varepsilon$ and $|F(x) - \tilde F(x)| \leq \Psi(x)$ pointwise.

Fix $\varepsilon > 0$. Let us consider statement (i), for which we consider the sequence
$(g(n)\Gamma)_{n \in [N]}$. If it is $\varepsilon$-equidistributed then, by the preceding, $e(\lfloor \alpha n \rfloor \beta n) = F(g(n)\Gamma)$
and $\tilde F(g(n)\Gamma)$ are within $2\varepsilon$ in $L^1[N]$. If this is not the case then, by the Quantitative
Leibman Dichotomy (Theorem 4.1) there must be some $O(\varepsilon^{-C})$-linear relation, up to
$O(\varepsilon^{-C}/N)$, between $\alpha$ and $\beta$. The rest of the argument in this case is essentially identical
to that at the very end of §7: we may find some $\gamma$ such that $\alpha = q_1\gamma + O(\varepsilon^{-C}/N)$
and $\beta = q_2\gamma + O(\varepsilon^{-C}/N)$, where $q_1, q_2$ are integers with magnitude at most $\varepsilon^{-C}$. Substituting
into $e(\alpha n\lfloor \beta n \rfloor)$ and making repeated use of the bracket identities of Lemma
5.5 as well as Lemma 3.3 one sees that in this case $e(\alpha n\lfloor \beta n \rfloor)$ lies within $\varepsilon$ in $L^1[N]$ of
a degree 2 nilsequence (of step 1) of complexity $O_\varepsilon(1)$. Thus in either case we have approximated
$e(\alpha n\lfloor \beta n \rfloor)$ within $O(\varepsilon)$ by a degree 2 polynomial nilsequence of complexity
$O_\varepsilon(1)$, thereby completing the proof of (i).

The analysis of (ii) is similar but, obviously, involves consideration of the sequence
$(g'(n)\Gamma)_{n \in [N]}$ instead. If the sequence $(g'(n)\Gamma)_{n \in [N]}$ is $\varepsilon$-equidistributed then we are
done, as before. If not, the Quantitative Leibman Dichotomy implies that either
$\alpha = \frac{a}{q} + O(\varepsilon^{-C}/N)$ or else $\beta = \frac{a}{q} + O(\varepsilon^{-C}/N^2)$. In the first case we may then partition
$[N]$ into progressions $P_1 \cup \dots \cup P_m$, $m = O_\varepsilon(1)$, on which $\lfloor \alpha n \rfloor$ is constant and then
apply Lemma 3.5 (ii) to approximate $e(\lfloor \alpha n \rfloor \beta n^2)$ within $O(\varepsilon)$ by a degree 2 polynomial
nilsequence of complexity $O_\varepsilon(1)$. In the second case we first apply the bracket identity
(5.3) to write

\[ e(\lfloor \alpha n \rfloor \beta n^2) = e(\alpha\beta n^3)\, e(-\alpha n \lfloor \beta n^2 \rfloor)\, e(-\{\alpha n\}\{\beta n^2\}). \tag{E.1} \]

The first term here is already a degree 3 polynomial nilsequence of complexity $O(1)$.
In the second term we may partition $[N]$ into progressions $P_1 \cup \dots \cup P_m$, $m = O_\varepsilon(1)$,
on which $\lfloor \beta n^2 \rfloor$ is constant and then apply Lemma 3.5 (ii) to approximate arbitrarily
closely by a degree 1 nilsequence. The third term, $e(-\{\alpha n\}\{\beta n^2\})$, may be handled
using the same trick as in the proof of Lemma 3.5 (iv). This reduces matters to handling
$e(\theta\{\theta' n\})$ (already known to be a degree 1 almost nilsequence by Lemma 3.5 (iii)) and
$e(\theta\{\theta' n^2\})$. By an argument almost identical to that used in the proof of Lemma 3.5
(iii), only using Lemma D.1 with $d = 2$ instead, this second object may be shown to
be a degree 2 almost nilsequence. Using Lemma 3.2 to put everything together, we
obtain the claim.
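
As a sanity check on the bracket identity (E.1), the following Python sketch compares the exponents of the two sides modulo 1 in exact rational arithmetic. The helper names and the rational test values of $\alpha$ and $\beta$ are illustrative assumptions and do not come from the paper; the check confirms only the algebra, not any equidistribution statement.

```python
from fractions import Fraction
from math import floor

def frac(x):
    """Fractional part {x} = x - floor(x)."""
    return x - floor(x)

def e1_lhs(alpha, beta, n):
    """Exponent of the left side of (E.1), floor(alpha n) * beta n^2, mod 1."""
    return frac(floor(alpha * n) * beta * n**2)

def e1_rhs(alpha, beta, n):
    """Sum of the three exponents on the right side of (E.1), mod 1."""
    cubic = alpha * beta * n**3
    bracket = -alpha * n * floor(beta * n**2)
    product = -frac(alpha * n) * frac(beta * n**2)
    return frac(cubic + bracket + product)

# Hypothetical test values; any rationals would do.
alpha, beta = Fraction(7, 23), Fraction(5, 31)
assert all(e1_lhs(alpha, beta, n) == e1_rhs(alpha, beta, n) for n in range(1, 200))
```

The two sides differ by the integer $\lfloor \alpha n \rfloor \lfloor \beta n^2 \rfloor$, which is why the comparison is made modulo 1.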

We turn now to the proof of (iii), which is important in the sense that it is the only 
place in our paper where a 3-step nilmanifold is actually constructed! 

Specifically, we let $\mathfrak{g}$ be the free 3-step Lie algebra generated by three generators
$e_1, e_2, e_3$, or equivalently

\[ G := \{ e_1^{t_1} e_2^{t_2} e_3^{t_3} e_{21}^{t_{21}} e_{211}^{t_{211}} e_{31}^{t_{31}} e_{311}^{t_{311}} e_{32}^{t_{32}} e_{322}^{t_{322}} e_{212}^{t_{212}} e_{312}^{t_{312}} e_{213}^{t_{213}} e_{313}^{t_{313}} e_{323}^{t_{323}} : t_i, t_{ij}, t_{ijk} \in \mathbb{R},\ 1 \leq i,j,k \leq 3 \} \]

subject to the relations $e_i^{-1} e_j^{-1} e_i e_j = [e_i, e_j] = e_{ij}$ for $1 \leq j < i \leq 3$, $[[e_i, e_j], e_k] = e_{ijk}$,
and the Jacobi relation $[[e_i, e_j], e_k]\,[[e_j, e_k], e_i]\,[[e_k, e_i], e_j] = 1$. Inside $G$ we take the
standard lattice

\[ \Gamma := \{ e_1^{n_1} e_2^{n_2} e_3^{n_3} e_{21}^{n_{21}} e_{211}^{n_{211}} e_{31}^{n_{31}} e_{311}^{n_{311}} e_{32}^{n_{32}} e_{322}^{n_{322}} e_{212}^{n_{212}} e_{312}^{n_{312}} e_{213}^{n_{213}} e_{313}^{n_{313}} e_{323}^{n_{323}} : n_i, n_{ij}, n_{ijk} \in \mathbb{Z},\ 1 \leq i,j,k \leq 3 \}. \]

Then $G/\Gamma$ is the free 3-step nilmanifold on 3 generators. We take $G_\bullet$ to be the lower
central series on $G$.

We abbreviate $e_1^{t_1} \cdots e_{323}^{t_{323}}$ as $(t_1, \ldots, t_{323})$. A computation yields the multiplication
law

\[ (t_1, \ldots, t_{323}) * (u_1, \ldots, u_{323}) = (s_1, \ldots, s_{323}) \]

where $s_i = t_i + u_i$ for $i = 1,2,3$, $s_{ij} = t_{ij} + u_{ij} + t_i u_j$ for $1 \leq j < i \leq 3$, and
$s_{312} = t_{312} + u_{312} + t_{32} u_1 + t_{31} u_2 + t_3 u_1 u_2$; we will ignore the other coordinates, as they
will not be needed in this calculation.

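Using this law, we see that for any real numbers $t_1, \ldots, t_{323}$, one has

\[ (t_1, \ldots, t_{323})\Gamma = (s_1, \ldots, s_{323})\Gamma \]

where

\begin{align*}
s_i &= \{t_i\} \text{ for } i = 1, 2, 3; \\
s_{ij} &= \{t_{ij} - t_i \lfloor t_j \rfloor\} \text{ for } 1 \leq j < i \leq 3; \\
s_{312} &= \{t_{312} - t_{32}\lfloor t_1 \rfloor - t_{31}\lfloor t_2 \rfloor + t_3 \lfloor t_1 \rfloor \lfloor t_2 \rfloor\}, \tag{E.2}
\end{align*}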






and with the other coordinates $s_{ijk} \in [0, 1]$ being explicitly computable, but not relevant
for this discussion. Thus if we let

\[ g(n) := e_1^{\alpha n} e_2^{\beta n} e_3^{\gamma n} \]

and let $F : G/\Gamma \to \mathbb{C}$ be the 3-step basic coordinate function

\[ F((s_1, \ldots, s_{323})\Gamma) := e(s_{312}) \]

for $s_1, \ldots, s_{323} \in [0, 1]$, then one sees that $e(\lfloor \alpha n \rfloor \lfloor \beta n \rfloor \gamma n)$ is equal to $F(g(n)\Gamma)$ times
objects already known to be almost nilsequences by earlier parts.
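
The reduction (E.2) can be reproduced mechanically from the multiplication law by right-multiplying by the lattice element with coordinates $u_i = -\lfloor t_i \rfloor$. The Python sketch below does this on the seven coordinates that matter here. It rests on two assumptions that should be read as such: the last term $t_3 u_1 u_2$ of the $312$ coordinate is inferred from (E.2), and $g(n)$ is taken to have coordinates $(\alpha n, \beta n, \gamma n, 0, \ldots, 0)$. All function names are illustrative.

```python
from fractions import Fraction
from math import floor

# Coordinates kept: t_1, t_2, t_3, t_21, t_31, t_32, t_312 (the rest are ignored).
KEYS = (1, 2, 3, (2, 1), (3, 1), (3, 2), 312)

def multiply(t, u):
    """Truncated multiplication law; the t_3*u_1*u_2 term is an inferred assumption."""
    s = {i: t[i] + u[i] for i in (1, 2, 3)}
    for i, j in ((2, 1), (3, 1), (3, 2)):
        s[(i, j)] = t[(i, j)] + u[(i, j)] + t[i] * u[j]
    s[312] = (t[312] + u[312] + t[(3, 2)] * u[1] + t[(3, 1)] * u[2]
              + t[3] * u[1] * u[2])
    return s

def frac(x):
    """Fractional part {x}."""
    return x - floor(x)

def reduce_312(t):
    """The s_312 coordinate of the representative of t*Gamma, as in (E.2)."""
    u = {k: 0 for k in KEYS}
    for i in (1, 2, 3):
        u[i] = -floor(t[i])        # lattice element removing the integer parts
    return frac(multiply(t, u)[312])  # a final shift by an integer power of e_312

def g(n, alpha, beta, gamma):
    """Assumed coordinates (alpha n, beta n, gamma n, 0, ..., 0) of g(n)."""
    t = {k: Fraction(0) for k in KEYS}
    t[1], t[2], t[3] = alpha * n, beta * n, gamma * n
    return t

# A generic point agrees with the closed formula in (E.2).
t = dict(zip(KEYS, map(Fraction, ("7/3", "-5/4", "9/7", "1/2", "3/5", "-2/9", "4/11"))))
f1, f2 = floor(t[1]), floor(t[2])
assert reduce_312(t) == frac(t[312] - t[(3, 2)] * f1 - t[(3, 1)] * f2 + t[3] * f1 * f2)

# Along g(n) the coordinate is {floor(alpha n) floor(beta n) gamma n}.
a, b, c = Fraction(3, 10), Fraction(5, 7), Fraction(2, 9)
assert all(reduce_312(g(n, a, b, c)) == frac(floor(a * n) * floor(b * n) * c * n)
           for n in range(1, 100))
```

Under these assumptions $F(g(n)\Gamma)$ coincides with $e(\lfloor \alpha n \rfloor \lfloor \beta n \rfloor \gamma n)$ exactly; a different sign convention for $g(n)$ would introduce the additional factors mentioned in the text.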

This concludes the argument unless $(g(n)\Gamma)_{n \in [N]}$ spends too much time near the
singularities of $F$, which are at the points $s_j = 0$ and $s_j = 1$, $j = 1,2,3$. There
will be no problem unless$^{10}$ one of the sequences $(\alpha n \pmod 1)_{n \in [N]}$, $(\beta n \pmod 1)_{n \in [N]}$,
$(\gamma n \pmod 1)_{n \in [N]}$ fails to be $\varepsilon$-equidistributed. If $(\alpha n \pmod 1)_{n \in [N]}$ is not
$\varepsilon$-equidistributed then, by the now-familiar application of Lemma D.1 with $d = 1$, we may
partition $[N]$ as a union $P_1 \cup \dots \cup P_m$ of at most $\varepsilon^{-C}$ progressions such that $\lfloor \alpha n \rfloor$ is
constant on $P_j$. We may then conclude using part (i) and Lemma 3.5 (ii). An identical
argument works if $(\beta n \pmod 1)_{n \in [N]}$ fails to be $\varepsilon$-equidistributed.

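The final case is when $(\gamma n \pmod 1)_{n \in [N]}$ fails to be $\varepsilon$-equidistributed. In this case we
note that

\[ \{\alpha n\}\{\beta n\}\{\gamma n\} = (\alpha n - \lfloor \alpha n \rfloor)(\beta n - \lfloor \beta n \rfloor)(\gamma n - \lfloor \gamma n \rfloor) \]

so that, expanding the product and discarding the integer $\lfloor \alpha n \rfloor \lfloor \beta n \rfloor \lfloor \gamma n \rfloor$,

\[ e(\lfloor \alpha n \rfloor \lfloor \beta n \rfloor \gamma n) = e(\{\alpha n\}\{\beta n\}\{\gamma n\})\, e(-\alpha\beta\gamma n^3)\, e(-\lfloor \alpha n \rfloor \beta n \lfloor \gamma n \rfloor) \times \]

\[ \times\, e(-\alpha n \lfloor \beta n \rfloor \lfloor \gamma n \rfloor)\, e(\alpha\beta n^2 \lfloor \gamma n \rfloor)\, e(\alpha\gamma n^2 \lfloor \beta n \rfloor)\, e(\beta\gamma n^2 \lfloor \alpha n \rfloor). \]

The cubic phase $e(-\alpha\beta\gamma n^3)$ is already a degree 3 polynomial nilsequence, and each of the
remaining terms on the right except the first can be handled using part (ii) or by those
instances of part (iii) already established. To deal with the first term $e(\{\alpha n\}\{\beta n\}\{\gamma n\})$
one may proceed exactly as in Lemma 3.5 (iv) to show that this is in fact a degree 1
almost nilsequence. Applying Lemma 3.2 to collect terms, we obtain the claim. □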
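
The expansion of $\{\alpha n\}\{\beta n\}\{\gamma n\}$ used in this final case can be checked term by term. The Python sketch below compares exponents modulo 1 in exact rational arithmetic; written out in full, the expansion contains the cubic phase $e(-\alpha\beta\gamma n^3)$ alongside the six bracket terms. The function names and test values are illustrative assumptions.

```python
from fractions import Fraction
from math import floor

def frac(x):
    """Fractional part {x}."""
    return x - floor(x)

def triple_lhs(a, b, c, n):
    """Exponent floor(a n) floor(b n) c n, mod 1."""
    return frac(floor(a * n) * floor(b * n) * c * n)

def triple_rhs(a, b, c, n):
    """Sum of the exponents obtained by expanding {a n}{b n}{c n}, mod 1."""
    A, B, C = a * n, b * n, c * n
    fa, fb, fc = floor(A), floor(B), floor(C)
    total = (frac(A) * frac(B) * frac(C)   # e({an}{bn}{cn})
             - A * B * C                   # cubic phase e(-abc n^3)
             - fa * B * fc                 # e(-floor(an) bn floor(cn))
             - A * fb * fc                 # e(-an floor(bn) floor(cn))
             + A * B * fc                  # e(ab n^2 floor(cn))
             + A * C * fb                  # e(ac n^2 floor(bn))
             + B * C * fa)                 # e(bc n^2 floor(an))
    return frac(total)

# Hypothetical rational test values.
a, b, c = Fraction(3, 7), Fraction(11, 13), Fraction(5, 17)
assert all(triple_lhs(a, b, c, n) == triple_rhs(a, b, c, n) for n in range(1, 200))
```

Dropping the cubic term from `triple_rhs` makes the assertion fail for generic values, so the polynomial phase has to be carried along with the bracket terms.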

The main business of the paper is now concluded. The remaining two appendices
were promised in the introduction but are not necessary for the proof of Theorem 1.5.

Appendix F. The strong inverse conjecture 

We have shown, in Theorem 1.5, that a 1-bounded function $f : [N] \to \mathbb{C}$ with
$\|f\|_{U^4} \geq \delta$ correlates with a degree 3 polynomial nilsequence $F(g(n)\Gamma)$. As we remarked
after the statement of Conjecture 1.3, this does not quite establish the result used in (for
example) [13], where correlation with a nilsequence $F(g^n x\Gamma)$ was used. In this section
we shall refer to linear nilsequences to distinguish objects of this last type from more 
general polynomial nilsequences. 

In this section we indicate, very briefly, how our arguments may be modified to obtain 
this apparently stronger statement. In the longer paper to come we will provide a quite 
general proof that Conjecture 1.3 implies this strong variant. Let us recall once more,
however, our view that this is the "wrong" perspective and that [13] works, with rather 
minimal changes, in the context of polynomial nilsequences. 

$^{10}$ This observation, which is stronger than saying that the abelianization $((\alpha n, \beta n, \gamma n) \pmod 1)_{n \in [N]}$
$\subseteq (\mathbb{R}/\mathbb{Z})^3$ is not equidistributed, is somewhat specific to the 3-step situation we are working with and
represents something of a simplification over the argument required in general.

We need only show that large $U^4$-norm entails correlation with almost linear
nilsequences, defined in exact analogy with Definition 3.1. We already have correlation
with almost polynomial sequences, so it will suffice to show that the almost polynomial
sequences used in the paper are also almost linear sequences of the same degree.

Clearly, any degree 1 almost nilsequence is already an almost linear 1-step
nilsequence, and an inspection of the previous appendix shows that $e(\lfloor \alpha n \rfloor \beta n)$ is an almost
linear 2-step nilsequence, modulo a quadratic phase $e(\gamma n^2)$, and similarly $e(\lfloor \alpha n \rfloor \lfloor \beta n \rfloor \gamma n)$
is an almost linear 3-step nilsequence modulo phases such as $e(\gamma n^3)$ and $e(\lfloor \gamma n \rfloor \delta n^2)$. As
Lemma 3.2 is clearly also valid for almost linear nilsequences, one only needs to verify
three remaining claims, for any real numbers $\alpha, \beta$:

• $e(\alpha n^2)$ is an almost linear 2-step nilsequence;

• $e(\alpha n^3)$ is an almost linear 3-step nilsequence; and

• $e(\lfloor \alpha n \rfloor \beta n^2)$ is an almost linear 3-step nilsequence.

We look first at $e(\alpha n^2)$ and consider once again the 2-step nilpotent group on 2
generators (Heisenberg group); looking all the way back to (5.2) and taking $g = (2\alpha, 1, 0)$
one may compute that $F_{[1,2]}(g^n\Gamma) = e(\alpha n^2 + \theta n)$ for some $\theta \in \mathbb{R}/\mathbb{Z}$. Now $F_{[1,2]}(t_1, t_2, t_{12})$
is discontinuous when $t_{12} = 0$ or $1$. If we wish to approximate $e(\alpha n^2 + \theta n)$ within $\varepsilon$ (in
$L^1[N]$) by a Lipschitz linear nilsequence, we must show (for example) that there are no
more than $10\varepsilon N$ values of $n \in [N]$ for which $\alpha n^2 + \theta n \pmod 1$ is within $\varepsilon$ of $0$. But if
this is not the case then, by Lemma D.1, we have $\alpha = a/q + O(\varepsilon^{-C}/N^2)$, at which point
we can split $[N]$ into $\varepsilon^{-C}$ progressions on which $e(\alpha n^2 + \theta n)$ is within $O(\varepsilon)$ of a linear
phase. One may then proceed using Lemma 3.5.
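
The computation of $F_{[1,2]}(g^n\Gamma)$ for $g = (2\alpha, 1, 0)$ can be sketched numerically. The code below assumes that the Heisenberg coordinates $(t_1, t_2, t_{12})$ obey the two-generator analogue of the law of Appendix E, namely $s_{12} = t_{12} + u_{12} + t_2 u_1$, and that $F_{[1,2]}$ is $e(\{t_{12} - t_2 \lfloor t_1 \rfloor\})$; this convention is an assumption chosen to be consistent with $F_{[1,2]}(g(n)\Gamma) = e(\lfloor \alpha n \rfloor \beta n)$ for $g(n) = (\alpha n, -\beta n, 0)$. Under it one finds $\theta = -\alpha$.

```python
from fractions import Fraction
from math import floor

def heis_mul(t, u):
    """(t1, t2, t12) * (u1, u2, u12) with s12 = t12 + u12 + t2*u1 (assumed convention)."""
    return (t[0] + u[0], t[1] + u[1], t[2] + u[2] + t[1] * u[0])

def heis_power(g, n):
    """g^n for n >= 0, by repeated right multiplication."""
    p = (Fraction(0), Fraction(0), Fraction(0))
    for _ in range(n):
        p = heis_mul(p, g)
    return p

def coordinate_phase(t):
    """Exponent of F_[1,2] at t*Gamma, assumed to be {t12 - t2*floor(t1)}."""
    x = t[2] - t[1] * floor(t[0])
    return x - floor(x)

alpha = Fraction(3, 11)                      # hypothetical test value
g = (2 * alpha, Fraction(1), Fraction(0))
for n in range(1, 40):
    expected = alpha * n**2 - alpha * n      # alpha n^2 + theta n with theta = -alpha
    assert coordinate_phase(heis_power(g, n)) == expected - floor(expected)
```

The same convention reproduces the first display of this appendix: for the point $(\alpha n, -\beta n, 0)$ the function `coordinate_phase` returns $\{\lfloor \alpha n \rfloor \beta n\}$.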

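Now we turn to the 3-step objects $e(\alpha n^3)$ and $e(\lfloor \alpha n \rfloor \beta n^2)$, which require some slightly
more careful calculations on the free 3-step nilmanifold. With the notation
for the free 3-step nilpotent Lie group as in the preceding section, let $g = e_1^{\alpha} e_2^{\beta} e_3^{\gamma}$. Then
one can check that

\[ g^n = e_1^{n\alpha} e_2^{n\beta} e_3^{n\gamma} e_{21}^{\alpha\beta\binom{n}{2}} e_{31}^{\alpha\gamma\binom{n}{2}} e_{32}^{\beta\gamma\binom{n}{2}} \cdots e_{312}^{\alpha\beta\gamma\left(2\binom{n}{3} + \binom{n}{2}\right)} \]

and hence one may compute (cf. (E.2))

\[ F_{312}(g^n\Gamma) = e\Big(\alpha\beta\gamma\big(2\tbinom{n}{3} + \tbinom{n}{2}\big) - \tbinom{n}{2}\beta\gamma\lfloor \alpha n \rfloor - \tbinom{n}{2}\alpha\gamma\lfloor \beta n \rfloor + n\gamma\lfloor \alpha n \rfloor \lfloor \beta n \rfloor\Big). \tag{F.1} \]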
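
The coordinates of $g^n$ quoted above can be recovered by iterating the multiplication law of Appendix E. The Python sketch below does so for the seven coordinates that enter (F.1). It assumes the truncated law $s_{312} = t_{312} + u_{312} + t_{32}u_1 + t_{31}u_2 + t_3 u_1 u_2$, whose last term is inferred from (E.2); the function names and test values are illustrative.

```python
from fractions import Fraction
from math import comb

# Coordinates kept: t_1, t_2, t_3, t_21, t_31, t_32, t_312.
KEYS = (1, 2, 3, (2, 1), (3, 1), (3, 2), 312)

def multiply(t, u):
    """Truncated multiplication law (the t_3*u_1*u_2 term is an inferred assumption)."""
    s = {i: t[i] + u[i] for i in (1, 2, 3)}
    for i, j in ((2, 1), (3, 1), (3, 2)):
        s[(i, j)] = t[(i, j)] + u[(i, j)] + t[i] * u[j]
    s[312] = (t[312] + u[312] + t[(3, 2)] * u[1] + t[(3, 1)] * u[2]
              + t[3] * u[1] * u[2])
    return s

def generator(alpha, beta, gamma):
    """Coordinates of g = e_1^alpha e_2^beta e_3^gamma."""
    g = {k: Fraction(0) for k in KEYS}
    g[1], g[2], g[3] = alpha, beta, gamma
    return g

def power(g, n):
    """g^n for n >= 0, by repeated right multiplication."""
    p = {k: Fraction(0) for k in KEYS}
    for _ in range(n):
        p = multiply(p, g)
    return p

alpha, beta, gamma = Fraction(2, 5), Fraction(3, 7), Fraction(5, 11)  # hypothetical
g = generator(alpha, beta, gamma)
for n in range(0, 30):
    p = power(g, n)
    assert p[(2, 1)] == alpha * beta * comb(n, 2)
    assert p[(3, 1)] == alpha * gamma * comb(n, 2)
    assert p[(3, 2)] == beta * gamma * comb(n, 2)
    assert p[312] == alpha * beta * gamma * (2 * comb(n, 3) + comb(n, 2))
```

The exponent $2\binom{n}{3} + \binom{n}{2}$ equals $n(n-1)(2n-1)/6$, the sum of the squares $0^2 + \dots + (n-1)^2$, which is what left multiplication $g \cdot g^{n-1}$ produces under the same law.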

Taking $\beta = \gamma = 1$ and replacing $\alpha$ by $6\alpha$ gives $F_{312}(g^n\Gamma) = e(\alpha n^3 + q(n))$ for some
quadratic $q$. The discontinuities of $F_{312}$ may be handled as for $F_{[1,2]}$ above, and so we
see that $e(\alpha n^3 + q(n))$ is an almost 3-step linear nilsequence for some quadratic $q$. Since
we can already obtain pure quadratic and linear phases as almost linear nilsequences of
step less than 3, it follows that $e(\alpha n^3)$ itself is an almost 3-step linear nilsequence.

Next, we take $\beta = 1$ and replace $\gamma$ by $-2\gamma$. Taking into account objects already
known to be almost linear nilsequences, we have now obtained $e(\gamma n^2 \lfloor n\alpha \rfloor)$ as a 3-step
almost linear nilsequence. Applying (E.1), we see that to obtain the desired object
$e(\lfloor n\alpha \rfloor \gamma n^2)$ it suffices to examine $e(\{\alpha n\}\{\gamma n^2\})$. By the trick used in the proof of
Lemma 3.5 (iv), it suffices in turn to handle $e(\theta\{\theta' n\})$ and $e(\theta\{\theta' n^2\})$. The first of
these is an almost 1-step (linear) nilsequence by Lemma 3.5 (iii). To handle the second,
proceed in the same way as in the proof of Lemma 3.5 (iii) but in the obvious places
substitute the fact (established above of course) that pure quadratic phases are 2-step
linear nilsequences, together with the case $d = 2$ of Lemma D.1. □
Appendix G. Necessity of the inverse conjectures

In this appendix we sketch a rather short proof of Proposition 1.4, which asserted
that functions which correlate with a degree $s$ polynomial nilsequence must have large
$U^{s+1}$-norm. Since linear nilsequences are merely special cases of polynomial ones, this
kind of argument could substitute in, for example, [13, Sec. 10], where a rather more
complicated approach was taken.

Proposition 1.4. Suppose that $f : [N] \to \mathbb{C}$ is a 1-bounded function, that $(F(g(n)\Gamma))_{n \in \mathbb{Z}}$
is a polynomial nilsequence of degree $s$ and complexity $O_\delta(1)$, and that

\[ |\mathbb{E}_{n \in [N]} f(n) F(g(n)\Gamma)| \geq \delta. \]

Then $\|f\|_{U^{s+1}} \gg_\delta 1$.

Sketch proof. The argument is only a sketch in that we do not address such issues
as the complexity of the nilsequences involved. We leave this as a (not particularly
interesting) exercise to the reader, most of the details of which may be found in [H]
where these complexity issues are discussed in detail. We proceed by induction on $s$,
the claim being obvious when $s = 0$. Let $f : [N] \to \mathbb{C}$ be a 1-bounded function, and
let $g : \mathbb{Z} \to G$ be a polynomial sequence of degree $s$ adapted to the filtration $G_\bullet$. Let
$F(g(n)\Gamma)$ be a polynomial nilsequence of complexity $O_\delta(1)$. Assume that

\[ |\mathbb{E}_{n \in [N]} f(n) F(g(n)\Gamma)| \gg_\delta 1. \tag{G.1} \]

By decomposing $F$ into vertical characters as in [TU Lemma 3.7], we may assume that
$F$ has a vertical frequency: that is, there is some nontrivial character $\xi : G_{(s)}/(G_{(s)} \cap \Gamma) \to \mathbb{R}/\mathbb{Z}$
such that

\[ F(g_s x) = e(\xi(g_s)) F(x) \]

for all $g_s \in G_{(s)}$ and $x \in G/\Gamma$.

By taking the modulus squared of (G.1) and making the substitution $n' = n + h$ we
see that

\[ |\mathbb{E}_{n \in [N]} \Delta_h f(n)\, F(g(n+h)\Gamma)\, \overline{F(g(n)\Gamma)}| \gg_\delta 1 \]

for $\gg_\delta N$ values of $h \in [N]$.

However for each fixed $h$ the "derivative" $n \mapsto F(g(n+h)\Gamma)\overline{F(g(n)\Gamma)}$ of the degree
$s$ nilsequence $F(g(n)\Gamma)$ is a Lipschitz polynomial nilsequence of degree $(s-1)$, the
underlying nilmanifold being

\[ \overline{G^{\square}} = (G \times_{G_{(2)}} G)/G^{\Delta}_{(s)}, \]

where $G \times_{G_{(2)}} G = \{(g,h) : g, h \in G,\ gh^{-1} \in G_{(2)}\}$, and $G^{\Delta}_{(s)} = \{(g_s, g_s) : g_s \in G_{(s)}\}$. For
details of this theory see Section 7 of [H].

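We now invoke our induction hypothesis to conclude that

\[ \|\Delta_h f\|_{U^s} \gg_\delta 1 \]

for $\gg_\delta N$ values of $h$.
Noting that

\[ \|f\|_{U^{s+1}}^{2^{s+1}} = \mathbb{E}_{h \in \mathbb{Z}/N'\mathbb{Z}} \|\Delta_h f\|_{U^s}^{2^s}, \]
we are done. □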
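
The recursive identity $\|f\|_{U^{s+1}}^{2^{s+1}} = \mathbb{E}_h \|\Delta_h f\|_{U^s}^{2^s}$ used in the last step can be tested by brute force on a small cyclic group. The Python sketch below computes Gowers norms directly as averages over cubes and compares the two sides. It works on $\mathbb{Z}/N\mathbb{Z}$ with $N = 5$ rather than on $[N]$, uses the convention $\Delta_h f(n) = f(n+h)\overline{f(n)}$, and all names are illustrative assumptions.

```python
import cmath
from itertools import product

def mult_derivative(f, h):
    """Delta_h f(n) = f(n + h) * conj(f(n)) on the cyclic group Z/N."""
    N = len(f)
    return [f[(n + h) % N] * f[n].conjugate() for n in range(N)]

def gowers_power(f, k):
    """||f||_{U^k}^{2^k} on Z/N, computed directly as an average over k-dimensional cubes."""
    N = len(f)
    total = 0
    for x, *hs in product(range(N), repeat=k + 1):
        term = 1
        for omega in product((0, 1), repeat=k):
            value = f[(x + sum(w * h for w, h in zip(omega, hs))) % N]
            # conjugate at vertices of the cube with an odd number of ones
            term *= value.conjugate() if sum(omega) % 2 else value
        total += term
    return (total / N ** (k + 1)).real

N = 5
f = [cmath.exp(2j * cmath.pi * (n * n) / N) for n in range(N)]  # quadratic phase e(n^2/N)
lhs = gowers_power(f, 3)
rhs = sum(gowers_power(mult_derivative(f, h), 2) for h in range(N)) / N
assert abs(lhs - rhs) < 1e-9
assert abs(lhs - 1) < 1e-9   # each Delta_h f is a linear phase, so the U^3 norm is 1
```

The direct computation takes of order $N^{k+1} 2^k$ operations, so it is only meant for very small $N$.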
It is perhaps worth reiterating the main point of the above argument, since it explains
the importance of nilsequences in the whole theory: the derivative of a degree $s$
polynomial nilsequence with a vertical character is a degree $(s-1)$ polynomial nilsequence.
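
The simplest instance of this principle is the quadratic phase $f(n) = e(\alpha n^2)$, whose derivative $f(n+h)\overline{f(n)} = e(2\alpha h n + \alpha h^2)$ is a linear phase in $n$ for each fixed $h$. The Python sketch below checks this at the level of exponents in exact arithmetic; it is only a one-dimensional toy case, not the nilmanifold construction above, and the names and the value of $\alpha$ are illustrative assumptions.

```python
from fractions import Fraction

def derivative_exponent(poly, h):
    """Exponent n -> poly(n + h) - poly(n) of f(n + h) * conj(f(n)) for f = e(poly)."""
    return lambda n: poly(n + h) - poly(n)

def finite_difference(p):
    """Discrete derivative n -> p(n + 1) - p(n)."""
    return lambda n: p(n + 1) - p(n)

alpha = Fraction(5, 13)                 # hypothetical test value
quadratic = lambda n: alpha * n * n     # exponent of the degree 2 phase e(alpha n^2)

for h in range(1, 6):
    d = derivative_exponent(quadratic, h)
    # The derivative has the explicit linear form 2*alpha*h*n + alpha*h^2 ...
    assert all(d(n) == 2 * alpha * h * n + alpha * h * h for n in range(-10, 10))
    # ... so its second finite difference in n vanishes identically.
    second = finite_difference(finite_difference(d))
    assert all(second(n) == 0 for n in range(-10, 10))
```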